The Silent Platform Revolution: How eBPF Is Fundamentally Transforming Cloud-Native Platforms

Key Takeaways

  • Many projects and products across the cloud-native ecosystem already use eBPF “under the hood” because it makes the kernel ready for cloud-native computing by enriching it with cloud-native context. eBPF has created a silent infrastructure movement that is already everywhere, enabling many new use cases that weren’t possible before.
  • eBPF has been production-proven for more than half a decade at Internet scale, running 24/7 on millions of servers and devices worldwide.
  • eBPF has enabled new abstractions in the OS layer, which gives platform teams advanced capabilities for cloud-native networking, security, and observability to safely customize the OS to their workload’s needs.
  • Extending the OS kernel used to be a hard, lengthy process in which years could pass before a change reached users. With eBPF, this developer-to-consumer feedback loop is nearly instant: changes can be rolled out into production on the fly without restarting or reconfiguring the application.
  • The next decade of infrastructure software will be defined by platform engineers who can use eBPF and the projects that leverage it to create the right abstractions for higher-level platforms. Open-source projects such as Cilium for eBPF-based networking, observability, and security have pioneered and brought this infrastructure movement to Kubernetes and cloud-native.

Kubernetes and cloud native have been around for nearly a decade. In that time, we’ve seen a Cambrian explosion of projects and innovation around infrastructure software. Through trial, error, and late nights, we have also learned what works and what doesn’t when running these systems at scale in production. With these fundamental projects and crucial experience, platform teams are now pushing innovation up the stack, but can the stack keep up with them?

With the shift in application design to API-driven microservices and the rise of Kubernetes-based platform engineering, networking and security teams have struggled to keep up because Kubernetes breaks traditional networking and security models. We have seen a similar technology sea change at least once before, with the transition to the cloud. The rules of data center infrastructure and developer workflow were completely rewritten as Linux boxes “in the cloud” began running the world’s most popular services. We are in a similar spot today, with a lot of churn around cloud native infrastructure pieces and not everyone knowing where it is headed; just look at the CNCF landscape. We have services communicating with each other over distributed networks atop a Linux kernel whose features and subsystems were, in many cases, never designed for cloud native in the first place.

The next decade of infrastructure software will be defined by platform engineers who can take these infrastructure building blocks and use them to create the right abstractions for higher-level platforms. Just as a construction engineer uses water, electricity, and construction materials to build buildings that people can use, platform engineers take hardware and infrastructure software and build platforms on which developers can safely and reliably deploy software, making high-impact changes frequently and predictably with minimal toil at scale. For the next act in the cloud native era, platform engineering teams must be able to provision, connect, observe, and secure scalable, dynamic, available, and high-performance environments so developers can focus on coding business logic. Many of the Linux kernel building blocks supporting these workloads are decades old. They need a new abstraction to keep up with the demands of the cloud native world. Luckily, it is already here and has been production-proven at the largest scale for years.

eBPF is creating the cloud native abstractions and new building blocks required for the cloud native world by allowing us to dynamically program the kernel in a safe, performant, and scalable way. It safely and efficiently extends the cloud native and other capabilities of the kernel without requiring changes to kernel source code or loading kernel modules, unlocking innovation by moving the kernel itself from a monolith to a more modular architecture enriched with cloud native context. These capabilities enable us to safely abstract the Linux kernel, iterate and innovate at this layer in a tight feedback loop, and become ready for the cloud native world. With these new superpowers for the Linux kernel, platform teams are ready for Day 2 of cloud native—and they might already be leveraging projects using eBPF without even knowing. There is a silent eBPF revolution reshaping platforms and the cloud native world in its image, and this is its story.

Extending a Packet Filter for Fun and for Profit

eBPF is a decades-old technology that began its life as the BSD Packet Filter (BPF) in 1992. At the time, Van Jacobson wanted to troubleshoot network issues, but existing network filters were too slow. His lab designed and created libpcap, tcpdump, and BPF as a backend to provide the required functionality. BPF was designed to be fast, efficient, and easily verifiable so that it could run inside the kernel, but its functionality was limited to read-only filtering based on simple packet header fields such as IP addresses and port numbers. Over time, as networking technology evolved, the limitations of this “classic” BPF (cBPF) became more apparent. In particular, it was stateless, which made it too limiting for complex packet operations and difficult for developers to extend.
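To make cBPF’s limits concrete, here is a minimal sketch in C of what using it looked like: a hand-assembled, read-only filter attached to a raw socket via the classic SO_ATTACH_FILTER interface (the bytecode and setup are illustrative and require root; they are not taken from the original tcpdump sources).

```c
#include <linux/filter.h>   /* struct sock_filter, struct sock_fprog */
#include <linux/if_ether.h> /* ETH_P_ALL */
#include <sys/socket.h>
#include <arpa/inet.h>      /* htons */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* cBPF bytecode: accept IPv4 UDP packets, drop everything else.
     * It can only read packet bytes and return a verdict: no state,
     * no writes, exactly the limitation described above. */
    struct sock_filter code[] = {
        { 0x28, 0, 0, 12 },      /* ldh [12]    ; load EtherType       */
        { 0x15, 0, 3, 0x0800 },  /* jeq #0x800  ; IPv4? else drop      */
        { 0x30, 0, 0, 23 },      /* ldb [23]    ; load IP protocol     */
        { 0x15, 0, 1, 17 },      /* jeq #17     ; UDP? else drop       */
        { 0x06, 0, 0, 0x40000 }, /* ret #262144 ; accept (copy packet) */
        { 0x06, 0, 0, 0 },       /* ret #0      ; drop                 */
    };
    struct sock_fprog prog = {
        .len = sizeof(code) / sizeof(code[0]),
        .filter = code,
    };

    int sock = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (sock < 0) { perror("socket"); return 1; }

    /* The kernel checks the bytecode for validity before accepting it. */
    if (setsockopt(sock, SOL_SOCKET, SO_ATTACH_FILTER, &prog, sizeof(prog)) < 0) {
        perror("setsockopt(SO_ATTACH_FILTER)");
        return 1;
    }

    /* read() on sock now only ever sees IPv4 UDP frames. */
    close(sock);
    return 0;
}
```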

Despite these constraints, the high-level concept behind cBPF, a minimal, verifiable instruction set that makes it feasible for the kernel to prove the safety of user-provided programs and then run them, provided an inspiration and platform for future innovation. In 2014, a new technology was merged into the Linux kernel that significantly extended the BPF (hence, “eBPF”) instruction set to create a more flexible and powerful version. Replacing the cBPF engine in the kernel was not the initial goal, since eBPF is a generic concept that can be applied in many places outside of networking, but at the time it was the feasible path for merging the new technology into the mainline kernel. Here is an interesting quote from Linus Torvalds:

So I can work with crazy people, that’s not the problem. They just need to sell their crazy stuff to me using non-crazy arguments and in small and well-defined pieces. When I ask for killer features, I want them to lull me into a safe and cozy world where the stuff they are pushing is actually useful to mainline people first. In other words, every new crazy feature should be hidden in a nice solid “Trojan Horse” gift: something that looks obviously good at first sight.

This, in short, describes the “organic” nature of the Linux kernel development model and matches perfectly how eBPF got merged into the kernel. To make incremental improvements, the natural fit was first to replace the cBPF infrastructure in the kernel, which improved its performance, and then, step by step, expose and improve the new eBPF technology on top of this foundation. From there, the early days of eBPF evolved in two parallel directions: networking and tracing. Every new eBPF feature merged into the kernel solved a concrete production need around these use cases; this requirement still holds true today. Projects like bcc, bpftrace, and Cilium helped shape the core building blocks of eBPF infrastructure long before its ecosystem took off and became mainstream. Today, eBPF is a generic technology that can run sandboxed programs in a privileged context such as the kernel, and it has little in common with “BSD,” “Packets,” or “Filters” anymore—eBPF is simply a pseudo-acronym referring to a technological revolution in the operating system kernel to safely extend and tailor it to the user’s needs.

With the ability to run complex yet safe programs, eBPF became a much more powerful platform for enriching the Linux kernel with cloud native context from higher up the stack to execute better policy decisions, process data more efficiently, move operations closer to their source, and iterate and innovate more quickly. In short, instead of patching, rebuilding, and rolling out a new kernel change, the feedback loop for infrastructure engineers has shrunk to the point that an eBPF program can be updated on the fly, without restarting services and without interrupting data processing. eBPF’s versatility also led to its adoption in areas outside of networking, such as security, observability, and tracing, where it can be used to detect and analyze system events in real time.

Accelerating Kernel Experiments and Evolution

Moving from cBPF to eBPF has drastically changed what is possible—and what we will build next. By moving beyond just a packet filter to a general-purpose sandboxed runtime, eBPF opened many new use cases around networking, observability, security, tracing, and profiling. eBPF is now a general-purpose compute engine within the Linux kernel that allows you to hook into, observe, and act upon anything happening in the kernel, like a plug-in for your web browser. A few key design features have enabled eBPF to accelerate innovation and create more performant and customizable systems for the cloud native world.

First, eBPF programs can hook almost anywhere in the kernel to modify functionality and customize its behavior without changing the kernel’s source. By not modifying the source code, eBPF reduces the time between a user needing a new feature and having it in hand from years to days. Because the Linux kernel runs on billions of devices, making changes upstream is not taken lightly. Suppose you want a new way to observe your application and need to pull that metric from the kernel. You would first have to convince the entire kernel community that it is a good idea—and a good idea for everyone running Linux—then it could be implemented, and it would finally reach users a few years later. With eBPF, you can go from coding to observation without even rebooting your machine, and you can tailor the kernel to your specific workload needs without affecting others. “eBPF has been very useful, and the real power of it is how it allows people to do specialized code that isn’t enabled until asked for,” said Linus Torvalds.
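As a hedged illustration of that workflow (the program and map names below are ours, not from the article), here is roughly what such a kernel-side metric collector looks like using libbpf conventions: a few dozen lines that count execve() calls per process, loaded into a running kernel with no rebuild and no reboot.

```c
// count_execs.bpf.c — a kernel-side metric without a kernel patch (illustrative).
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* A hash map shared with user space: PID -> number of execve() calls. */
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, __u32);
    __type(value, __u64);
} exec_counts SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_execve")
int count_execve(void *ctx)
{
    __u32 pid = bpf_get_current_pid_tgid() >> 32;
    __u64 one = 1;

    /* Increment the per-PID counter, creating the entry on first use. */
    __u64 *val = bpf_map_lookup_elem(&exec_counts, &pid);
    if (val)
        __sync_fetch_and_add(val, 1);
    else
        bpf_map_update_elem(&exec_counts, &pid, &one, BPF_ANY);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";
```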

Second, because the verifier checks that programs are safe to execute, eBPF developers can continue to innovate without worrying about crashing the kernel or introducing other instabilities. This lets them and their end users be confident that they are shipping stable code that can be leveraged in production. For platform teams and SREs, this is also crucial for using eBPF to safely troubleshoot issues they encounter in production.

When applications are ready to go to production, eBPF programs can be added at runtime without workload disruption or a node reboot. This is a huge benefit at large scale because it massively decreases the toil required to keep the platform up to date and reduces the risk of workload disruption from a rollout gone wrong. eBPF programs are JIT-compiled for near-native execution speed, and by shifting context from user space to kernel space, they let users bypass or skip parts of the kernel that aren’t needed or used, enhancing performance. However, unlike complete kernel bypasses in user space, eBPF can still leverage all the kernel infrastructure and building blocks it wants without reinventing the wheel. eBPF can pick and choose the best pieces of the kernel and mix them with custom business logic to solve a specific problem. Finally, being able to modify kernel behavior at run time and bypass parts of the stack creates an extremely short feedback loop for developers. It has finally allowed experimentation in areas like network congestion control and process scheduling in the kernel.
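A sketch of what that runtime rollout looks like from user space, assuming the compiled object count_execs.bpf.o from the previous sketch (the flow is the standard libbpf open/load/attach sequence; the file and program names are illustrative):

```c
// loader.c — attach an eBPF program to a live kernel, no reboot required.
#include <bpf/libbpf.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Open and load the object: the kernel verifies the program for
     * safety and JIT-compiles it to native code at this point. */
    struct bpf_object *obj = bpf_object__open_file("count_execs.bpf.o", NULL);
    if (!obj) { fprintf(stderr, "failed to open object\n"); return 1; }
    if (bpf_object__load(obj)) { fprintf(stderr, "verifier rejected or load failed\n"); return 1; }

    struct bpf_program *prog = bpf_object__find_program_by_name(obj, "count_execve");
    if (!prog) { fprintf(stderr, "program not found\n"); return 1; }

    /* Attaching hooks the tracepoint immediately; running workloads
     * are not restarted or interrupted. */
    struct bpf_link *link = bpf_program__attach(prog);
    if (!link) { fprintf(stderr, "attach failed\n"); return 1; }

    pause(); /* the program keeps running in the kernel until detached */

    bpf_link__destroy(link);
    bpf_object__close(obj);
    return 0;
}
```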

Growing out of the classic packet filter and taking a major leap beyond the traditional use case unlocked many new possibilities in the kernel, from optimizing resource usage to adding customized business logic. eBPF allows us to speed up kernel innovation, create new abstractions, and dramatically increase performance. eBPF not only reduces the time, risk, and overhead it takes to add new features to production workloads, but in some cases, it even makes it possible in the first place.

Every Packet, Every Day: eBPF at Google, Meta, and Netflix

So many benefits raise the question of whether eBPF can deliver in the real world—and the answer has been a resounding yes. Meta and Google have some of the world’s largest data center footprints; Netflix accounts for about 15% of the Internet’s traffic. Each of these companies has been using eBPF under the hood for years in production, and the results speak for themselves.

Meta was the first company to put eBPF into production at scale with its load balancer project, Katran. Since 2017, every packet going into a Meta data center has been processed with eBPF—that’s a lot of cat pictures. Meta has also used eBPF for many more advanced use cases, most recently improving scheduler efficiency, which increased throughput by 15%, a massive boost and resource saving at their scale. Google also processes most of its data center traffic through eBPF, uses it for runtime security and observability, and defaults its Google Cloud customers to an eBPF-based dataplane for networking. In the Android operating system, which powers over 70% of mobile devices and has more than 2.5 billion active users spanning over 190 countries, almost every networking packet hits eBPF. Finally, Netflix relies extensively on eBPF for performance monitoring and analysis of its fleet; Netflix engineers pioneered eBPF tooling, such as bpftrace, to make major leaps in visibility for troubleshooting production servers and built eBPF-based collectors for on-CPU and off-CPU flame graphs.

eBPF clearly works and has been providing extensive benefits for “Internet-scale” companies for the better part of a decade, but those benefits also need to be translated to the rest of us.

eBPF (R)evolution: Making Cloud Native Speed and Scale Possible

At the beginning of the cloud native era, GIFEE (Google Infrastructure for Everyone Else) was a popular phrase, but it largely fell out of favor because not everyone is Google or needs Google infrastructure. Instead, people want simple solutions that solve their problems, which raises the question of why eBPF is different. Cloud native environments are meant to “run scalable applications in modern, dynamic environments.” Scalable and dynamic are key to understanding why eBPF is the evolution of the kernel that the cloud native revolution needs.

The Linux kernel, as usual, is the foundation for building cloud native platforms. Applications now just use sockets as data sources and sinks and the network as a communication bus. But cloud native needs newer abstractions than those currently available in the Linux kernel because many of its building blocks, like cgroups (CPU and memory handling), namespaces (net, mount, pid), SELinux, seccomp, netfilter, netlink, AppArmor, auditd, and perf, are decades old, designed before cloud even had a name. They don’t always talk to each other, and some are inflexible, allowing only global policies rather than per-container or per-service ones. They lack awareness of Pods or any higher-level service abstractions and rely on iptables for networking instead of offering cloud native primitives.

As a platform team, if you want to provide developer tools for a cloud native environment, you can still be stuck in this box, where cloud native environments can’t be expressed efficiently. Without the right tools, platform teams can find themselves in a future they are not ready to handle. eBPF now allows tools to rebuild the abstractions in the Linux kernel from the ground up. These new abstractions are unlocking the next wave of cloud native innovation and will set the course for the cloud native revolution.

For example, in traditional networking, packets are processed by the kernel, and several layers of the network stack inspect each packet before it reaches its destination. This can result in high overhead and slow processing, especially in large-scale cloud environments with many network packets to process. eBPF instead allows inserting custom code into the kernel that is executed for each packet as it passes through the network stack. This enables more efficient and targeted network traffic processing, reducing overhead and improving performance. Benchmarks from Cilium showed that switching from iptables to eBPF increased throughput 6x, and moving from IPVS-based load balancing to eBPF allowed Seznam.cz to double throughput while also reducing CPU usage by 72x. Instead of providing marginal improvements on an old abstraction, eBPF enables magnitudes of enhancement.
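To see the mechanism behind those numbers, here is a minimal XDP sketch (illustrative only; this is not Cilium’s actual datapath): the function runs for every incoming packet at the earliest possible hook, before the kernel allocates the full socket buffer, and returns a verdict on the spot.

```c
// xdp_filter.bpf.c — per-packet custom code at the XDP hook (illustrative).
#include <linux/bpf.h>
#include <linux/if_ether.h>
#include <linux/ip.h>
#include <linux/in.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("xdp")
int xdp_filter(struct xdp_md *ctx)
{
    void *data     = (void *)(long)ctx->data;
    void *data_end = (void *)(long)ctx->data_end;

    /* The verifier insists on bounds checks before every packet access. */
    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_PASS;
    if (eth->h_proto != bpf_htons(ETH_P_IP))
        return XDP_PASS;

    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_PASS;

    /* Example policy: drop ICMP at the driver, pass everything else up
     * the stack; dropped packets never touch the rest of the kernel. */
    return ip->protocol == IPPROTO_ICMP ? XDP_DROP : XDP_PASS;
}

char LICENSE[] SEC("license") = "GPL";
```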

eBPF doesn’t just stop at networking like its predecessor; because it is a general-purpose computing environment that can hook anywhere in the kernel, it also extends to observability, security, and many other areas. “I think the future of cloud native security will be based on eBPF technology because it’s a new and powerful way to get visibility into the kernel, which was very difficult before,” said Chris Aniszczyk, CTO of Cloud Native Computing Foundation. “At the intersection of application and infrastructure monitoring, and security monitoring, this can provide a holistic approach for teams to detect, mitigate, and resolve issues faster.”

eBPF provides ways to connect, observe, and secure applications at cloud native speed and scale. “As applications shift toward being a collection of API-driven services driven by cloud native paradigms, the security, reliability, observability, and performance of all applications become fundamentally dependent on a new connectivity layer driven by eBPF,” said Dan Wendlandt, CEO and co-founder of Isovalent. “It’s going to be a critical layer in the new cloud native infrastructure stack.”

The eBPF revolution is changing cloud native; the best part is that it is already here.

The Silent eBPF Revolution Is Already a Part of Your Platform

While the benefits of eBPF are clear, it is so low level that platform teams without the luxury of Linux kernel development experience need a friendlier interface. This is the magic of eBPF—it is already inside many of the tools running today’s cloud native platforms, and you may already be leveraging it without even knowing. If you spin up a Kubernetes cluster on any major cloud provider, you are leveraging eBPF through Cilium. If you use Pixie for observability or Parca for continuous profiling, that is also eBPF.

eBPF is a powerful force that is transforming the software industry. Marc Andreessen’s famous quote that “software is eating the world” has been semi-jokingly recoined by Cloudflare as “eBPF is eating the world.” However, success for eBPF is not when all developers know about it, but when developers start demanding faster networking, effortless monitoring and observability, and easier-to-use security solutions. Less than 1% of developers may ever program something in eBPF, but the other 99% will benefit from it. eBPF will have completely taken over when a variety of projects and products provide massive developer-experience improvements over upstreaming code to the Linux kernel or writing kernel modules. We are already well on our way to that reality.

eBPF has revolutionized the way infrastructure platforms are and will be built and has enabled many new cloud native use cases that were previously difficult or impossible to implement. With eBPF, platform engineers can safely and efficiently extend the capabilities of the Linux kernel, allowing them to innovate quickly. This allows for creating new abstractions and building blocks tailored to the demands of the cloud native world, making it easier for developers to deploy software at scale.

eBPF has been in production for over half a decade at the largest scale and has proven to be a safe, performant, and scalable way to dynamically program the kernel. The silent eBPF revolution has taken hold and is already used in projects and products around the cloud native ecosystem and beyond. With eBPF, platform teams are now ready for the next act in the cloud native era, where they can provision, connect, observe, and secure scalable, dynamic, available, and high-performance environments so developers can focus on just coding business logic.
