Linkerd v2: How Lessons from Production Adoption Resulted in a Rewrite of the Service Mesh

Key Takeaways

  • Linkerd 2.0 introduced a substantial rewrite of the widely adopted service mesh, which was previously written in Scala, and was inspired by work undertaken at Twitter on the Finagle RPC system. 
  • This new version moved off the JVM to a split Go (for the control plane) and Rust (data plane) implementation.
  • The Buoyant team made deep technical investments in the underlying Rust network stack, and refocused the UX on simplicity, ease of use, and low cognitive overhead. The result was dramatically faster, lighter, and simpler to operate. 
  • It has been over six months since the launch of Linkerd 2.0, and the team believes that the rewrite has paid dividends, with many users who were previously unable to adopt the 1.x branch now happily adopting 2.x.

The service mesh is rapidly becoming a critical part of the modern cloud-native stack. Moving the mechanics of interservice communication (in datacenter parlance, “east-west traffic”) from application code to the platform layer and providing tooling around the measuring and manipulation of this communication provides operators and platform owners with a much-needed layer of visibility and control that is largely independent of application code.

While the term “service mesh” has only recently entered the industry lexicon, the concepts behind it are not new. These concepts have been in production for a decade or more at companies like Twitter, Netflix, and Google, typically in the form of “fat client” libraries like Finagle, Hystrix, and Stubby. From the technical perspective, the co-deployed proxy approach of the modern service mesh is more like a repackaging of these ideas from library to proxy form, enabled by the rapid adoption of containers and orchestrators like Docker and Kubernetes.

The rise of the service mesh started with Linkerd, the earliest service mesh and the project to coin the term. First released in 2016, Linkerd currently features two parallel lines of development: the original 1.x branch, built on the “Twitter stack” of Scala, Finagle, Netty, and the JVM; and the 2.x branch, rebuilt from the ground up in Rust and Go.

The launch of Linkerd 2.0 represented not just a change in underlying implementation but also a significant change in approach, informed by a wealth of lessons learned from years of production experience. This article discusses these lessons, and how they became the basis of the philosophy, design, and implementation of Linkerd 2.0.

What is Linkerd and why should I care?

Linkerd is an open-source service mesh and Cloud Native Computing Foundation member project. First launched in 2016, it currently powers the production architecture of companies around the globe, from startups like Strava and Planet Labs to large enterprises like Comcast, Expedia, Ask, and Chase Bank.

Linkerd provides observability, reliability, and security features for microservice applications. Crucially, it provides this functionality at the platform layer. This means that Linkerd’s features are uniformly available across all services, regardless of implementation or deployment, and are provided to platform owners in a way that’s largely independent of the roadmaps or technical choices of the developer teams. For example, Linkerd can add TLS to connections between services and allow the platform owner to configure the way that certificates are generated, shared, and validated without needing to insert TLS-related work into the roadmaps of the developer teams for each service.

Linkerd works by inserting transparent, layer-5/layer-7 TCP proxies around the services that the operator chooses to mesh. These proxies form Linkerd’s data plane, and handle all incoming and outgoing traffic to their services. The data plane is managed by Linkerd’s control plane, a set of processes that provide the operator with a single point for monitoring and manipulating traffic flowing through the services.

Linkerd is based on a fundamental realization: that the request traffic flowing through a microservice application is as much a part of its operational surface area as the code of the application itself. Thus, while Linkerd can’t introspect the internals of a given microservice, it can report top-line health metrics by observing the success rate, throughput, and latency of its responses. Similarly, while Linkerd can’t modify the application’s error-handling logic, it can improve service health by automatically retrying requests to failing or lagging instances. Linkerd can also encrypt connections, provide cryptographically secured service identity, perform canaries and blue/green deploys by shifting traffic percentages, and so on.
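To make the idea of top-line health metrics concrete, here is a minimal sketch of how success rate, throughput, and latency percentiles can be derived purely from observed responses, without any knowledge of a service's internals. The `Observation` record, the function names, and the choice to count only 5xx statuses as failures are assumptions made for illustration; this is not Linkerd's actual implementation.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    """One proxied response as seen from outside the service (hypothetical record)."""
    status: int        # HTTP status code of the response
    latency_ms: float  # time between request and response, in milliseconds


def success_rate(observations):
    """Fraction of responses that were not server errors (assumption: 5xx = failure)."""
    if not observations:
        return None
    ok = sum(1 for o in observations if o.status < 500)
    return ok / len(observations)


def throughput_rps(observations, window_seconds):
    """Requests per second over the observation window."""
    return len(observations) / window_seconds


def latency_percentile(observations, pct):
    """Nearest-rank percentile of latency, e.g. pct=99 for p99."""
    if not observations:
        return None
    ordered = sorted(o.latency_ms for o in observations)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]
```

Because every input is something a proxy can see on the wire, the same three functions apply to any service regardless of the language it is written in.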

Linkerd 1.x

Linkerd was born from our experience operating one of the world’s earliest and largest microservice applications at Twitter. As Twitter migrated from a three-tiered Ruby on Rails application to a proto-cloud-native architecture built on Mesos and the JVM, it created a library, Finagle, that provided instrumentation, retries, service discovery, and more to every service. The introduction of Finagle was a critical part of allowing Twitter to adopt microservices at scale.

Linkerd 1.x was launched in 2016 and built directly on the production-tested Twitter stack of Finagle, Scala, Netty, and the JVM. Our initial goal was simply to provide Finagle’s powerful semantics as widely as possible. Recognizing that the audience for a Scala library for asynchronous RPC calls was limited at best, we bundled Finagle in proxy form, allowing it to be used with application code written in any language. Happily, the contemporaneous rise of containers and orchestrators greatly reduced the operational cost of deploying proxies alongside each service instance. Linkerd thus gained momentum, especially in the cloud-native community, which was rapidly adopting technology like Docker and Kubernetes.

From these humble beginnings grew Linkerd and the service-mesh model itself. Today, the 1.x branch of Linkerd is in active use in companies around the globe and continues to be actively developed.

Linkerd lessons learned

Despite Linkerd’s success, many organizations were unwilling to deploy Linkerd into production or were willing but had to make major investments in order to do so.

This friction was caused by several factors. Some organizations were simply reluctant to introduce the JVM into their operational environment. The JVM has a particularly complex operational surface area, and some operations teams, rightly or wrongly, shied away from introducing any JVM-based software into their stack — especially one playing a mission-critical role like Linkerd.

Other organizations were reluctant to allocate the system resources that Linkerd required. Generally speaking, Linkerd 1.x was very good at scaling up — a single instance could process many tens of thousands of requests per second, given sufficient memory and CPU — but it was not good at scaling down: it was difficult to get the memory footprint of a single instance below a 150-MB RSS. Scala, Netty, and Finagle worsened the problem as they were all designed to maximize throughput in resource-rich environments, i.e. at the expense of memory. 

Since an organization might deploy hundreds or thousands of Linkerd proxies, this footprint was important. As an alternative, we recommended that users deploy the data plane per host rather than per process, allowing users to better amortize resource consumption. However, this added operational complexity, and limited Linkerd’s ability to provide certain features such as per-service TLS certificates.
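A back-of-the-envelope calculation shows why the per-proxy footprint matters at this scale. The 150-MB figure comes from the text above; the cluster size (50 hosts with 20 service instances each) is a made-up example, and the function name is hypothetical.

```python
def footprint_comparison(hosts, instances_per_host, rss_mb_per_proxy):
    """Compare total proxy memory for per-process vs per-host deployment."""
    per_process_mb = hosts * instances_per_host * rss_mb_per_proxy  # one proxy per service instance
    per_host_mb = hosts * rss_mb_per_proxy                          # one shared proxy per host
    return {
        "per_process_mb": per_process_mb,
        "per_host_mb": per_host_mb,
        "ratio": per_process_mb / per_host_mb,
    }


# Hypothetical cluster: 50 hosts, 20 instances per host, 150 MB RSS per proxy.
result = footprint_comparison(hosts=50, instances_per_host=20, rss_mb_per_proxy=150)
print(result)
```

Under these assumed numbers, per-process deployment needs 150,000 MB of proxy memory across the cluster against 7,500 MB for per-host deployment, a factor equal to the number of instances per host.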

(More recent JVMs have improved these numbers significantly. Linkerd 1.x’s resource footprint and tail latency are greatly reduced under IBM’s OpenJ9, and Oracle’s GraalVM promises to reduce it even further.)

Finally, there was the issue of complexity. Finagle was a rich library with a large feature set, and we exposed many of these features more or less directly to the user via a configuration file. As a result, Linkerd 1.x was customizable and flexible but had a steep learning curve. One design mistake in particular was the use of delegate tables (dtabs) — a backtracking, hierarchical, suffix-preserving routing language used by Finagle — as a fundamental configuration primitive. Any user who attempted to customize Linkerd’s behavior would quickly run into dtabs and have to make a significant mental investment before being able to proceed.

Fresh start

Despite Linkerd’s rising level of adoption, we were convinced by late 2017 that we needed to re-examine our approach. It was clear that Linkerd’s value propositions were right, but the requirements it imposed on operational teams were unnecessary. As we reflected on our experience helping organizations adopt Linkerd, we settled on some key principles of what the future of Linkerd should look like:

  1. Minimal resource requirements. Linkerd should impose as minimal a performance and resource cost as possible, especially at the proxy layer.
  2. Just works. Linkerd should not break existing applications, nor should it require complex configuration simply to get started.
  3. Simple. Linkerd should be operationally simple with low cognitive overhead. Users should find its components clear and its behavior understandable.

Each of these requirements posed its own set of challenges. To minimize system resource requirements, it was clear we would need to move off of the JVM. To “just work”, we would need to invest in complex techniques such as network protocol detection. Finally, to be simple — the most difficult requirement — we would need to explicitly prioritize minimalism, incrementality, and introspection at every point.

Facing a rewrite, we realized we would have to focus on a concrete initial use case. As a starting point, we decided to focus purely on Kubernetes environments and on the common protocols of HTTP, HTTP/2, and gRPC, while understanding that we would later need to relax all of these constraints.

Goal 1: Minimal resource requirements

In Linkerd 1.x, both the control plane and the data plane were written for the same platform (the JVM). However, the product requirements for these two pieces are actually quite different. The data plane, deployed alongside every instance of every service and handling all traffic to and from that service, must be as fast and as small as possible. More than that, it must be secure: Linkerd’s users are trusting it with incredibly sensitive information, including data subject to PCI and HIPAA compliance regulations.

On the other hand, the control plane, deployed to the side and not in the critical path for requests, has more relaxed speed and resource requirements. Here, it was more important to focus on extensibility and ease of iteration.

From early on, it was clear that Go was the right language for the control plane. While Go had a managed runtime and garbage collector like the JVM, these were tuned for modern network services and did not impose even a fraction of the cost we saw from the JVM. Go was also orders of magnitude less operationally complex than the JVM, and its static binaries, memory footprint, and startup times were a welcome improvement. While our benchmarks showed that Go was still slower than natively compiled languages, it was fast enough for the control plane. Finally, Go’s excellent library ecosystem gave us access to a wealth of existing functionality around Kubernetes, and we felt that the language’s low barrier to entry and relative popularity would encourage open-source contribution.

While we considered Go and C++ for the data plane, it was clear from the outset that Rust was the only language that met our requirements. Rust’s focus on safety, especially its powerful borrow checker, which enforced safe memory practices at compile time, allowed it to sidestep a whole class of memory-related security vulnerabilities, making it far more appealing than C++. Its ability to compile to native code and its fine-grained control of memory management gave it a significant performance advantage over Go and better control of memory footprint. Rust’s rich and expressive language appealed to us Scala programmers, and its model of zero-cost abstractions suggested that (unlike with Scala) we could make use of that expressivity without sacrificing safety or performance.

Rust did suffer from one major downside, circa 2017: its library ecosystem significantly lagged behind those of other languages. We knew that the choice of Rust would also require a heavy investment in networking libraries.

Goal 2: Just works

With the underlying technology choices set, we moved on to satisfying the next design goal: Linkerd should just work. For Kubernetes applications, this meant that adding Linkerd to a pre-existing functioning application couldn’t cause it to break nor could it require configuration beyond the bare minimum.

We made several design choices to satisfy this goal. We designed Linkerd’s proxies so that they were capable of protocol detection: they would proxy TCP traffic, but could automatically detect the layer-7 protocol used. Combined with iptables rewiring at pod creation time, this meant that application code making any TCP connection would transparently have that connection proxied through its local Linkerd instance, and if that connection used HTTP, HTTP/2, or gRPC, Linkerd would automatically alter its behavior to layer-7 semantics — e.g., by reporting success rates, retrying idempotent requests, load balancing at the request level, etc. This was all done without requiring configuration from the user.
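The core idea of protocol detection can be sketched as peeking at the first bytes of a new connection before deciding how to treat it. The example below checks for the fixed HTTP/2 client connection preface and for an HTTP/1.x request line, and otherwise falls back to opaque TCP. The function name and return values are hypothetical, and this is an illustration of the technique rather than how linkerd2-proxy implements it. Since gRPC runs over HTTP/2, telling it apart would require a further look at request headers, which is omitted here.

```python
# Fixed byte sequence every HTTP/2 client sends first on a connection.
HTTP2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

HTTP1_METHODS = (
    b"GET", b"HEAD", b"POST", b"PUT", b"DELETE",
    b"CONNECT", b"OPTIONS", b"TRACE", b"PATCH",
)
HTTP1_VERSIONS = (b"HTTP/1.0", b"HTTP/1.1")


def detect_protocol(first_bytes: bytes) -> str:
    """Classify a connection from its initial bytes as 'http2', 'http1', or 'tcp'.

    Simplification: assumes the caller has already buffered enough bytes;
    a real proxy would keep reading (with a timeout) before giving up.
    """
    if first_bytes.startswith(HTTP2_PREFACE):
        return "http2"

    # An HTTP/1.x request starts with: METHOD SP request-target SP HTTP-version CRLF
    request_line, crlf, _ = first_bytes.partition(b"\r\n")
    if crlf:
        parts = request_line.split(b" ")
        if (
            len(parts) == 3
            and parts[0] in HTTP1_METHODS
            and parts[2] in HTTP1_VERSIONS
        ):
            return "http1"

    # Anything else is proxied as an opaque byte stream.
    return "tcp"
```

A connection classified as `tcp` is still proxied; it simply does not get request-level features such as success rates or per-request load balancing.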

We also invested in providing as much functionality as possible out of the box. While Linkerd 1.x provided rich metrics on a per-proxy basis, it left the aggregation and reporting of these metrics to the user. In Linkerd 2.0, we bundled a small, time-bounded instance of Prometheus as part of the control plane so that we could provide aggregated metrics in the form of Grafana dashboards out of the box. We used these same metrics to power a set of UNIX-style commands that allow operators to observe live service behavior from the command line. Combined with the protocol detection, this meant that platform operators could get rich service-level metrics out of Linkerd immediately, without configuration or complex setup.
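The difference between per-proxy and aggregated metrics can be illustrated with a small sketch that rolls up counters reported by individual proxies into one figure per deployment. The tuple layout, the function name, and the sample values are assumptions for illustration only; in Linkerd 2.0 this aggregation is done by the bundled Prometheus instance, not by code like this.

```python
from collections import defaultdict


def aggregate_by_deployment(samples):
    """Roll up per-proxy counters into per-deployment totals.

    samples: iterable of (deployment, success_count, failure_count),
    one tuple per proxy (hypothetical layout).
    """
    totals = defaultdict(lambda: [0, 0])
    for deployment, ok, failed in samples:
        totals[deployment][0] += ok
        totals[deployment][1] += failed

    report = {}
    for deployment, (ok, failed) in totals.items():
        requests = ok + failed
        report[deployment] = {
            "requests": requests,
            # No traffic means the success rate is undefined, not zero.
            "success_rate": ok / requests if requests else None,
        }
    return report


# Two proxies belong to "web", one idle proxy belongs to "api".
samples = [("web", 90, 10), ("web", 100, 0), ("api", 0, 0)]
print(aggregate_by_deployment(samples))
```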

Fig. 1: Inbound and outbound TCP connections to/from application instances are automatically routed through the Linkerd data plane (“Linkerd-proxy”), which in turn is monitored and managed by the Linkerd control plane.

Goal 3: Simple

This was the most important goal, even though it was in tension with the goal of ease of use. (We’re indebted to Rich Hickey’s talk on simplicity versus easiness for clarifying our thinking on this matter.) We knew that Linkerd was an operator-facing product — i.e., as opposed to a service mesh that a cloud provider operates for you, we expect you to operate Linkerd yourself. This meant that minimizing Linkerd’s operational surface area was of paramount concern. Fortunately, our years of helping companies adopt Linkerd 1.x gave us concrete ideas about what this would entail:

  • Linkerd couldn’t hide what it was doing or feel overly magical.
  • Linkerd’s internal state should be inspectable.
  • Linkerd’s components should be well-defined, discrete, and clearly demarcated.

We made several design choices in service of this goal. Rather than unifying the control plane into a single monolithic process, we split it at its natural boundaries: a web service powers the web UI, a proxy-API service handles communication with the data plane, and so on. We exposed these components directly to the user on Linkerd’s dashboard, and we designed the dashboard and CLI UX to fit into the idioms and expectations of the larger Kubernetes ecosystem: the linkerd install command emits a Kubernetes manifest in YAML form to be applied to the cluster via kubectl apply, the Linkerd dashboard looks and feels like the Kubernetes dashboard, and so on.

We also avoided complexity by exercising restraint. We operated on core Kubernetes nouns like deployments and pods, rather than introducing our own definitions of what a “service” was. We built on existing Kubernetes features like secrets and admission controllers whenever possible. We minimized our use of custom resource definitions because we knew that these added significant complexity to the cluster. And so on.

Finally, we added extensive diagnostics, allowing operators to inspect Linkerd’s internal state and validate its expectations. We meshed the control plane (i.e., each control-plane pod has a data-plane sidecar proxying all traffic to and from it), allowing operators to use Linkerd’s rich telemetry to understand and monitor the state of Linkerd, just as they do their own applications. We added commands like linkerd endpoints, which dumps Linkerd’s internal service-discovery information, and linkerd check, which verifies that every aspect of a Kubernetes cluster and a Linkerd installation is operating as expected.

In short, we did our best to make Linkerd explicit and observable rather than easy and magical.

Fig. 2: The dashboard in Linkerd 2.0 mimics the look and feel of the Kubernetes dashboard, from the visual treatment to the navigation.

Linkerd 2.0 today

We launched Linkerd 2.0 in September 2018, approximately a year after beginning the efforts internally. While the value propositions were fundamentally the same, our focus on ease of use, operational simplicity, and minimal resource requirements resulted in a substantially different product shape from 1.x. Six months in, this approach has paid dividends, with many users who were previously unable to adopt the 1.x branch now happily adopting 2.x.

Our choice of Rust has garnered significant interest; while this was originally something of a gamble (in fact, we released an early version under the name “Conduit”, afraid to tarnish the Linkerd brand), it is clear by now that the gamble has paid off. Since 2017, we’ve made significant investments in core Rust networking libraries such as Tokio, Tower, and Hyper. We’ve tuned Linkerd 2.0’s proxy (called, simply enough, “linkerd2-proxy”) to efficiently free the memory allocated to a request when the request terminates, allowing for incredibly sharp latency distributions as memory allocation and de-allocation are amortized across request flow. Linkerd’s proxies now feature a p99 latency of less than a millisecond and a memory footprint of well under 10 MB, an order of magnitude smaller than Linkerd 1.x.

Today, Linkerd has a thriving community of adopters and contributors, and the future of the project is bright. With 50+ contributors to the 2.x branch, weekly edge releases, and an active and friendly community Slack channel, we are proud of our efforts and look forward to continuing to solve real-life challenges for our users while staying true to our design philosophy of simplicity, ease of use, and minimal resource requirements.

About the Author

William Morgan is the co-founder and CEO of Buoyant, a startup focused on building open-source service-mesh technology for cloud-native environments. Prior to Buoyant, he was an infrastructure engineer at Twitter, where he helped move Twitter from a failing monolithic Ruby on Rails app to a highly distributed, fault-tolerant microservice architecture. He was a software engineer at Powerset, Microsoft, and Adap.tv, a research scientist at MITRE, and holds an MS in computer science from Stanford University.
