How Jetstack Set Up a Global Load Balancer for Multiple Kubernetes Clusters

Jetstack's engineering team described how they set up a global load balancer across multiple Google Kubernetes Engine (GKE) clusters, using Google's non-standard container-native load balancing together with Google Cloud Armor for DDoS protection.

One of Jetstack's customers had an existing setup consisting of multiple Kubernetes clusters with DNS-based routing to different load balancer IPs. They wanted to incorporate Cloud Armor DDoS protection and use container-native load balancing to "improve traffic visibility and network performance". The team went through multiple migration phases to introduce these features, along with a custom way to tie a single GLB to multiple Kubernetes cluster backends.

Kubernetes has three different ways of "load balancing" in its spec at the Service level, not including Ingress - ClusterIP, NodePort and LoadBalancer. Jetstack's customer used the "LoadBalancer" Service type, which translates to an implementation-specific LB on the underlying cloud platform. In GKE, it is implemented by a network LB (NLB). However, to accept traffic from the internet, a Kubernetes cluster typically has an Ingress, which is implemented by a global LB (GLB) in GKE. The customer's previous setup used geolocation-based routing in AWS Route 53, which can return different IP addresses depending on where the DNS queries originate.
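The geolocation-based routing in the previous setup can be pictured with a small sketch. This is a toy model only, not the Route 53 API: the region names and IP addresses below are hypothetical (the addresses come from the documentation ranges), and the fallback record is an assumption about how unmatched queries might be handled.

```python
# Toy model of geolocation-based DNS routing; this is NOT the Route 53 API.
# Region names and IP addresses are hypothetical placeholders.
GEO_RECORDS = {
    "europe": "192.0.2.10",
    "north-america": "198.51.100.10",
    "asia": "203.0.113.10",
}

# Assumed fallback for queries whose origin matches no configured region.
DEFAULT_RECORD = "192.0.2.10"


def resolve(query_region: str) -> str:
    """Return the load balancer IP for the region a DNS query originates from."""
    return GEO_RECORDS.get(query_region, DEFAULT_RECORD)


if __name__ == "__main__":
    for region in ("asia", "south-america"):
        print(region, "->", resolve(region))
```

Each region resolves to a different load balancer IP, which is why this setup ends up with several independent entry points rather than one.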

Google's NLBs do not support the Cloud Armor DDoS protection service, although the Cloud Armor configuration supports Layer 3-7 network rules. Switching to an L7 LB - i.e. a Global Load Balancer (GLB) - was therefore necessary. Creating an Ingress resource in GKE automatically creates one. An L7 GLB brings flexibility in URL routing and TLS termination at the load balancer itself, but it restricts traffic-serving ports to 80, 8080 and 443. That restriction required some changes in the app, which previously used several other ports. There were still multiple L7 load balancers at the end of this phase, with DNS pointing to their IP addresses.

GKE has a feature called "container-native load balancing" which allows pods to receive traffic directly from a load balancer. This is not part of the Kubernetes spec but an optimization in GKE, and thus cannot be used in other vendors' managed Kubernetes offerings. Without it, traffic from an LB to a pod takes a circuitous route inside GKE's network, and the extra network hops can increase latency. Container-native load balancing requires creating Network Endpoint Groups (NEGs) - a Google-specific feature - which contain the IP addresses of the backend pods that serve traffic. The second phase of the migration introduced this.
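As a rough illustration, NEGs in GKE are requested through the `cloud.google.com/neg` annotation on a Service. The sketch below builds such a manifest as a Python dictionary. The article does not show the customer's actual manifests, so the Service name, label selector and port are hypothetical, and the helper function is an illustration rather than part of any library.

```python
import json


def service_manifest(name: str, port: int, standalone: bool = False) -> dict:
    """Build a Service manifest that asks GKE to create NEGs for its pods.

    standalone=False: NEGs are created for use by a GKE Ingress.
    standalone=True:  NEGs are created for the given port without an Ingress,
                      so they can be attached to a separately managed LB.
    The name, selector and port are hypothetical example values.
    """
    if standalone:
        neg_config = {"exposed_ports": {str(port): {}}}
    else:
        neg_config = {"ingress": True}

    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            # The annotation value is a JSON string, not a nested object.
            "annotations": {"cloud.google.com/neg": json.dumps(neg_config)},
        },
        "spec": {
            "type": "ClusterIP",
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


if __name__ == "__main__":
    print(json.dumps(service_manifest("web", 8080, standalone=True), indent=2))
```

The standalone form is the one that matters when the load balancer is created outside the cluster, since the NEG then exists independently of any Ingress resource.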

In the third phase, the primary change was to use a single GLB IP address instead of using DNS to return region-specific IP addresses of different load balancers. Kubernetes does not have a mechanism to include multiple clusters behind a single Ingress. Google has a beta tool that attempts to do this, but it is at an early stage. It is important to note that having a single GLB (or another Ingress LB) for multiple Kubernetes clusters is different from having multiple Kubernetes clusters working together. The former is about using a single endpoint through which global traffic gets routed to independent Kubernetes clusters. The latter is about using a single control plane for multiple Kubernetes clusters. Doing the former in other clouds is also not simple. Using Terraform's Google provider for automation, Jetstack's team created the NEG resources and the GLB separately, and tied them together with annotations. There is another tool that purports to make this easier. Other companies have solved this in other ways - e.g. with an Envoy control plane or with Cluster Registry.
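The idea of a single endpoint in front of independent clusters can be sketched with a toy model. This is not how Google's GLB is implemented, and it ignores capacity and spillover; it only shows one entry point choosing among per-cluster backends and skipping unhealthy ones. The cluster names, regions and latency figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One independent cluster's backend (e.g. its NEGs) behind the shared LB."""
    cluster: str
    healthy: bool
    latency_ms: dict  # hypothetical estimates: client region -> latency in ms


def pick_backend(client_region: str, backends: list) -> Backend:
    """Pick the healthy backend with the lowest estimated latency for the client."""
    candidates = [
        b for b in backends if b.healthy and client_region in b.latency_ms
    ]
    if not candidates:
        raise RuntimeError("no healthy backend available")
    return min(candidates, key=lambda b: b.latency_ms[client_region])


if __name__ == "__main__":
    backends = [
        Backend("gke-europe", True, {"europe": 15, "asia": 180}),
        Backend("gke-asia", True, {"europe": 170, "asia": 20}),
    ]
    print(pick_backend("asia", backends).cluster)
    # If the Asian cluster fails, the same endpoint falls back to Europe.
    backends[1].healthy = False
    print(pick_backend("asia", backends).cluster)
```

Unlike the DNS-based setup, clients always talk to the same IP address, and the choice of cluster happens behind it.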
