High Availability for Self-Managed Kubernetes Clusters at DT One

The engineering team at DT One, a global provider of mobile top-up and reward solutions, wrote about how they implemented IP-failover based high availability for their self-managed Kubernetes cluster ingress on Hetzner’s hosting platform.

DT One runs its Kubernetes clusters on bare metal machines on Hetzner. Each cluster has an nginx-based Kubernetes ingress which exposes services to the internet. After trying various approaches to achieve high availability (HA) for the ingress nodes, the team settled on a Puppet-automated, IP-failover based solution leveraging Hetzner's "vSwitch" virtual network.

Kubernetes clusters expose services to external networks like the internet by using a Layer 7 (L7) ingress. Most cloud providers that provide managed Kubernetes also provide an ingress implementation with a load balancer. However, self-managed Kubernetes ingresses usually depend on nginx as a load balancer.

To add high availability to such setups when multiple IPs are exposed for external traffic, Kubernetes needs a virtual IP (VIP) managed by a tool like keepalived. keepalived provides HA using the Virtual Router Redundancy Protocol (VRRP) by switching a virtual IP between hosts. For example, there might be multiple ingress nodes that are configured in round-robin DNS. When a node fails, it has to be removed manually from DNS. If a VIP is used, the DNS name points to just a single IP (the virtual IP), and keepalived ensures that it always points to a live node running the ingress. On cloud platforms like GCP, AWS and Azure that provide a load balancer, VIPs are unnecessary as the platform takes care of providing an HA load balancer. However, on platforms where the load balancer is managed by the customer, a VIP can provide HA.
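The failover idea behind a VIP can be illustrated with a small Python sketch. It is a toy model, not keepalived itself: the node names and priority values are hypothetical, and the election rule is simplified to "the live node with the highest priority holds the VIP", leaving out VRRP details such as advertisement timers and tie-breaking.

```python
from dataclasses import dataclass


@dataclass
class IngressNode:
    name: str          # hypothetical node name
    priority: int      # higher value wins the election, as in VRRP
    alive: bool = True


def elect_vip_holder(nodes):
    """Return the live node with the highest priority, or None if all are down."""
    live = [n for n in nodes if n.alive]
    return max(live, key=lambda n: n.priority, default=None)


nodes = [
    IngressNode("ingress-a", 150),
    IngressNode("ingress-b", 100),
    IngressNode("ingress-c", 50),
]

# While every node is healthy, the highest-priority node holds the VIP.
holder_before = elect_vip_holder(nodes).name

# Simulate a failure of the current holder; the VIP moves to the next node
# without any change to DNS, which keeps pointing at the single virtual IP.
nodes[0].alive = False
holder_after = elect_vip_holder(nodes).name

print(holder_before, "->", holder_after)
```

In contrast to round-robin DNS, nothing has to be removed from DNS when a node fails in this model; only the holder of the virtual IP changes.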

InfoQ got in touch with Jan Hejl, DevOps tech lead at DT One, to understand more about the solution.

Usually, the ingress ports are bound to the main host's IP. Hetzner provides a failover IP feature where an IP address (or even a subnet) can be switched from one server to another within 60 seconds, irrespective of the server's location. The team initially used custom Python scripts, managed by keepalived, to switch Hetzner's failover IPs between ingress nodes. They later adopted a modified version of an existing solution, but it had some drawbacks, like being forced to use encrypted VRRP and to stick to IPv4. The newer VRRPv3 protocol supports IPv6, but encryption was not possible. Hejl explains the security issues:

A bare-metal machine from Hetzner is part of a /29 or even a /26 subnet, so others can sniff something (say, using tcpdump) that is not part of their own traffic. Especially in the case where the IPs are within the same subnet, spoofing the multicast IP address is not that hard even though you have implemented things like arp_ignore / rp_filter etc.

Since it's a self-managed L7 ingress, how does DT One protect against DDoS-like attacks? Hejl explains that "Hetzner is the first level of defense and then there are our own firewalls."

DT One uses Puppet for almost everything, says Hejl, with Terraform for automating Hetzner virtual machines or AWS deployments. Puppet was also used to automate the initial solution. This was superseded by a feature that Hetzner introduced last year called vSwitch. vSwitch allocates a separate Layer 2 (L2) network for customer machines, which means that unencrypted VRRP traffic becomes possible without the security concerns. However, there were still issues with Hetzner's failover IPs. The time taken for a change to propagate across the network (~30 seconds) was too long, and the setup remained susceptible to any outages that might occur at Hetzner.

The team finally arrived at a working solution using keepalived and three physical hosts that communicate over a separate vSwitch network, automated using Puppet. Each of the nodes acts as the leader for one VIP and as a follower for the other two VIPs. keepalived supports email notifications when the status of a node changes. In addition, Hejl says, they use Prometheus, Grafana and Alertmanager for monitoring their systems and alerting.
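One plausible reading of this three-host layout is that every host is the preferred leader for one VIP and a follower for the other two, so traffic is spread across all hosts while any of them can take over for a failed peer. The Python sketch below models that reading only; it is not DT One's Puppet or keepalived configuration. The host names, the priority values, and the VIP addresses (taken from the 192.0.2.0/24 documentation range) are assumptions made for illustration.

```python
# Hypothetical hosts and VIPs; the addresses come from a documentation range.
NODES = ["node-1", "node-2", "node-3"]
VIPS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]


def build_priorities(nodes, vips, base=150, step=50):
    """Rotate priorities so that each node is the preferred leader of one VIP."""
    count = len(nodes)
    table = {}
    for i, vip in enumerate(vips):
        # The i-th node gets the highest priority for the i-th VIP,
        # the following nodes get progressively lower priorities.
        table[vip] = {nodes[(i + k) % count]: base - k * step for k in range(count)}
    return table


def assign_vips(priorities, alive):
    """For every VIP, pick the live node with the highest priority."""
    assignment = {}
    for vip, prios in priorities.items():
        candidates = {node: p for node, p in prios.items() if node in alive}
        assignment[vip] = max(candidates, key=candidates.get) if candidates else None
    return assignment


priorities = build_priorities(NODES, VIPS)

# All hosts healthy: each host leads exactly one VIP.
healthy = assign_vips(priorities, alive=set(NODES))

# node-1 fails: its VIP moves to the follower with the next-highest priority.
degraded = assign_vips(priorities, alive={"node-2", "node-3"})

for vip in VIPS:
    print(vip, healthy[vip], "->", degraded[vip])
```

With rotated priorities, a single host failure moves only the VIP that host was leading; the other two VIPs stay where they were.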
