GoDaddy Releases Automatic Canary Deployments Tool for Kubernetes

GoDaddy recently released an open-source tool to automate gated deployments in Kubernetes. Every time a deployment happens, the tool can run regression tests and pull metrics from data backends like New Relic. After a validation period, the tool automatically decides whether to roll back or continue with the deployment. Users can also run A/B tests and experiments with a small portion of live traffic.

The tool, Kubernetes Gated Deployments (KGD), extends the Kubernetes API through CRDs for automatic canary deployments. Kubernetes has native support for rolling updates, but not canary deployments. KGD’s operator adds the ability to introduce custom guardrails in a canary deployment before completing a full roll-out. There are existing tools for automatic canary deployments, such as Kayenta (used with Spinnaker) or Flagger, but these require additional components in the cluster. GoDaddy primarily uses native Kubernetes objects, which is why KGD works with existing native objects without requiring any other tool.

KGD’s controller analyzes metrics from data backends over a validation period. When that period ends, the controller automatically decides to either roll back or continue with the deployment. A data backend could be a service such as New Relic or CloudWatch, or even logs from pods. As of today, the only backend KGD supports is New Relic.
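
The time-boxed decision described above can be sketched roughly as follows. This is an illustrative Python sketch, not KGD's code (the project itself is written in Node): the function name `run_validation`, the "wait"/"pass"/"fail" status strings, the polling interval, and the choice to treat a timeout as a failure are all assumptions made for the example.

```python
import time
from typing import Callable


def run_validation(
    check: Callable[[], str],
    max_time: float,
    interval: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll check() until it gives a verdict or max_time seconds elapse.

    check is a hypothetical callable returning "wait", "pass" or "fail".
    """
    deadline = clock() + max_time
    while True:
        status = check()
        if status in ("pass", "fail"):
            return status
        if clock() >= deadline:
            # Assumption of this sketch: no verdict in time means roll back.
            return "fail"
        sleep(interval)
```

Injecting `clock` and `sleep` keeps the loop testable without real waiting; a real controller would more likely react to cluster events than sleep in a loop.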

To start using KGD, users have to deploy the operator into the cluster. One way to do this is to clone the GitHub repository and then run the following command:

kubectl apply -f gated-deployments.yml

Alternatively, the KGD controller can be installed with Helm. The command below uses Helm 2 syntax; Helm 3 dropped the --name flag and takes the release name as a positional argument instead:

helm install helm/kubernetes-gated-deployments --name kubernetes-gated-deployments

After installing the KGD controller, users need to configure the application as two deployments and one service. The existing deployment object, called the control deployment, always runs the current version. The additional deployment, called the treatment deployment, is where experiments with a new version run. To get started, users deploy a GatedDeployment object that specifies which deployment objects to use and which decision plugins determine whether the deployment is successful.

Below is a sample of the manifest file for a GatedDeployment object:

apiVersion: 'kubernetes-client.io/v1'
kind: GatedDeployment
metadata:
  name: example-rest-service
deploymentDescriptor:
  control:
    name: example-rest-service-control
  treatment:
    name: example-rest-service-treatment
  decisionPlugins:
    - name: newRelicPerformance
      accountId: 807783
      secretName: newrelic-secrets
      secretKey: example-rest-service
      appName: example-rest-service
      minSamples: 50
      maxTime: 600
      testPath: /shopper/products

The share of traffic sent to the canary is determined by the number of pods in the control and treatment deployments. For instance, if the control deployment has nine pod replicas and the treatment deployment has one, the new version receives roughly ten percent of the traffic. Adjusting replica counts is currently the only way to configure traffic splitting.
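
The replica arithmetic can be sketched as below. The function name is hypothetical, and the sketch assumes the service spreads requests evenly across all ready pods, which is only approximately true in practice.

```python
def treatment_traffic_share(control_replicas: int, treatment_replicas: int) -> float:
    """Return the approximate fraction of traffic reaching the treatment pods.

    Assumes requests are balanced evenly across every pod behind the service.
    """
    if control_replicas < 0 or treatment_replicas < 0:
        raise ValueError("replica counts must be non-negative")
    total = control_replicas + treatment_replicas
    if total == 0:
        raise ValueError("at least one pod is required")
    return treatment_replicas / total


# The example from the article: nine control pods and one treatment pod.
print(treatment_traffic_share(9, 1))  # 0.1
```

Because the split depends only on pod counts, finer-grained percentages require more pods overall; one treatment pod next to nineteen control pods gives about five percent.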

Once the KGD controller determines that the treatment deployment is not causing any problems, it promotes the treatment deployment: the control deployment is updated to match the treatment deployment’s spec, and the treatment deployment is scaled down to zero replicas. Conversely, if the treatment deployment is causing problems, the KGD controller scales it down to zero replicas, which rolls the experiment back. For any future application change, users update only the treatment deployment.
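
The two outcomes can be modeled with a small Python sketch. It is a simplified illustration, not KGD's implementation: the `Deployment` class and `resolve_experiment` function are hypothetical, a single `image` field stands in for the full pod spec, and the sketch assumes the control deployment keeps its own replica count after promotion.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Deployment:
    name: str
    image: str     # stand-in for the full pod template spec
    replicas: int


def resolve_experiment(
    control: Deployment, treatment: Deployment, passed: bool
) -> Tuple[Deployment, Deployment]:
    """Return the (control, treatment) state after the experiment ends."""
    if passed:
        # Promotion: control takes over the treatment's spec.
        control = replace(control, image=treatment.image)
    # In both outcomes the treatment deployment is scaled to zero.
    treatment = replace(treatment, replicas=0)
    return control, treatment


control = Deployment("example-rest-service-control", "app:v1", 9)
treatment = Deployment("example-rest-service-treatment", "app:v2", 1)

promoted_control, idle_treatment = resolve_experiment(control, treatment, passed=True)
print(promoted_control.image, idle_treatment.replicas)  # app:v2 0
```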

KGD’s functionality can be extended with decision plugins. A decision plugin contains the logic to pull metrics from a data backend and determine the status of a canary deployment. For instance, a plugin could run regression tests or read logs from pods. Users can configure multiple decision plugins, in which case every plugin must return a "pass" status for the treatment deployment to be promoted. Plugins are written in Node.js, and there’s a guide on how to contribute new plugins.
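
The "all plugins must pass" rule can be illustrated with a short Python sketch. The real plugins are Node modules, so this only shows the aggregation logic: the function name, the extra "wait" status for plugins that have not decided yet, and the choice to fail as soon as any plugin fails are assumptions, not details confirmed by the article.

```python
from typing import Iterable


def overall_status(plugin_statuses: Iterable[str]) -> str:
    """Combine per-plugin statuses into a single experiment status."""
    statuses = list(plugin_statuses)
    if any(status == "fail" for status in statuses):
        # Assumption: a single failing plugin is enough to roll back.
        return "fail"
    if statuses and all(status == "pass" for status in statuses):
        return "pass"
    # Some plugin is still undecided, or no plugins are configured.
    return "wait"


print(overall_status(["pass", "pass"]))  # pass
print(overall_status(["pass", "wait"]))  # wait
print(overall_status(["pass", "fail"]))  # fail
```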
