Load Balancing Content on InfoQ
-
Enhancing Reliability Using Service-Level Prioritized Load Shedding: Netflix at QCon SF 2025
At QCon San Francisco, Netflix engineers presented their Service-Level Prioritized Load Shedding strategy, which improves reliability during traffic spikes. By prioritizing high-value requests and automating management across microservices, they safeguard user experience and system stability. Key insights stress prioritization, automation, and structured load shedding for resilience.
-
Amazon API Gateway Adds Dynamic Routing Based on Headers and Paths
AWS's new dynamic routing rules for Amazon API Gateway let developers streamline API traffic management by routing requests based on HTTP headers instead of complex URL structures. The feature simplifies API versioning, enables fine-grained control, supports A/B testing, and improves request visibility.
-
Cloudflare Introduces Advanced Load Balancing to Eliminate Hardware Dependency
Cloudflare recently unveiled significant advancements in its load balancing capabilities, aiming to eliminate the need for hardware-based solutions. The company’s latest enhancements integrate seamlessly with Cloudflare One, providing end-to-end private traffic flow support and WARP authenticated device traffic.
-
Grab Improves Kafka on Kubernetes Fault Tolerance with Strimzi, AWS AddOns and EBS
Grab updated its Kafka on Kubernetes setup to improve fault tolerance and completely eliminate human intervention in case of unexpected Kafka broker terminations. To address the shortcomings of the initial design, the team integrated with AWS Node Termination Handler (NTH), used the Load Balancer Controller for target group mapping, and switched to EBS volumes for storage.
-
Slack Migrates to Cell-Based Architecture on AWS to Mitigate Gray Failures
Slack migrated most of the critical user-facing services from a monolithic to a cell-based architecture over the last 1.5 years. The move was triggered by the impact of networking outages affecting a single availability zone, causing user-impacting service degradation. The new architecture allows incrementally draining all the traffic away from the affected availability zone within 5 minutes.
-
Allegro Uses Control Theory for Workload Balancing in its Apache Kafka PubSub Platform
Allegro, the largest eCommerce platform in Poland, implemented dynamic workload balancing in Hermes, its open-source publish-subscribe message broker, built on top of Apache Kafka. The new workload balancing algorithm achieves more uniform resource utilization and lower infrastructure costs.
-
Microsoft Azure Cross-Region (Global) Load Balancer Now Generally Available
Microsoft recently announced the general availability (GA) of Azure cross-region (Global) Load Balancer in all Azure public and national cloud regions.
-
Dropbox Unplugs Data Center to Test Resilience
Dropbox has published a detailed account of why and how they unplugged an entire data center to test their disaster readiness. The disaster readiness team began building tools to make frequent failovers possible, and ran their first formalized failover in 2019. Eventually, with new tooling and procedures, the data center was unplugged. The exercise significantly reduced their recovery time objective (RTO).
-
HashiCorp Consul-Terraform-Sync Adds Task Creation API and New Integrations
HashiCorp has released version 0.5 of Consul-Terraform-Sync (CTS). CTS automates common networking tasks by running Terraform modules as services are added to or removed from Consul. This release adds new secure API endpoints to facilitate modifying existing tasks, new ecosystem integrations, and support for triggering Terraform workflows on Consul key-value changes.
-
NGINX Controller Application Delivery Modules Improve Health Checks and Caching Configurations
NGINX has released new versions of their NGINX Controller Application Delivery Module, a control plane solution for NGINX Plus load balancers. The new features include enhanced workload health-checks, improvements to caching configuration, and instance groups.
-
AWS Announces Gateway Load Balancer
AWS Gateway Load Balancer is a new fully managed network gateway and load balancer. The service is tailored to deploy, scale, and manage third-party virtual appliances such as firewalls, intrusion detection and prevention systems, and deep packet inspection systems in the cloud.
-
The ALB Ingress Controller Is Now the AWS Load Balancer Controller
AWS has rebranded the Application Load Balancer (ALB) Ingress Controller as the AWS Load Balancer Controller, which now supports both Application Load Balancers and Network Load Balancers. The public cloud vendor recently announced the renaming and updates to the controller, which is labeled either as a new controller or as AWS ALB Ingress Controller v2.
-
How Jetstack Set Up a Global Load Balancer for Multiple Kubernetes Clusters
Jetstack's engineering team talked about setting up a global load balancer across multiple Google Kubernetes Engine (GKE) clusters while utilizing Google’s non-standard container-native load balancing with Google Cloud Armor for DDoS protection.
-
High Availability for Self-Managed Kubernetes Clusters at DT One
The engineering team at DT One, a global provider of mobile top-up and reward solutions, wrote about how they implemented IP failover-based high availability for their self-managed Kubernetes cluster ingress on Hetzner’s hosting platform.
-
HAProxy EBtree: Design for a Scheduler, and Use (Almost) Everywhere
At QCon New York 2019, Andjelko Iharos presented how CTO Willy Tarreau and the HAProxy team implemented a scheduler using an EBtree data structure to optimize performance and memory usage of the HAProxy load balancer.