Kayenta: An Open Source Canary Analysis Tool from Netflix and Google

Kayenta is an open source automated canary analysis tool that evaluates whether a new version of software is ready for production. It is based on a tool developed internally at Netflix and was integrated into Spinnaker, with Google's help, to automate canary releases across multiple clouds. The new version under analysis can contain code changes, configuration changes, or both.

A canary release is a technique for reducing the risk of deploying a new version of software into production. The new version, referred to as the canary, is deployed to a small subset of users alongside the stable running version. Traffic is split between the two versions so that a portion of incoming requests is diverted to the canary.
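The traffic split described above can be sketched in a few lines of Python. This is only an illustration of weighted routing in general, not how Spinnaker or Kayenta route requests; the function name `choose_version` and the 5% canary fraction are assumptions made for the example.

```python
import random


def choose_version(canary_fraction: float, rng: random.Random) -> str:
    """Pick which version serves a single request.

    canary_fraction is the share of traffic (0.0 to 1.0) sent to the canary.
    """
    if not 0.0 <= canary_fraction <= 1.0:
        raise ValueError("canary_fraction must be between 0.0 and 1.0")
    # rng.random() returns a float in [0.0, 1.0), so a fraction of 0.0
    # never selects the canary and a fraction of 1.0 always does.
    return "canary" if rng.random() < canary_fraction else "stable"


# Simulate 10,000 requests with a hypothetical 5% canary share.
rng = random.Random(42)
counts = {"canary": 0, "stable": 0}
for _ in range(10_000):
    counts[choose_version(0.05, rng)] += 1

print(counts)
```

In a real deployment the split is usually enforced by a load balancer or service mesh rather than by application code, but the principle is the same: a small, configurable share of requests reaches the canary.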

To analyze a canary release, Kayenta needs to probe it and compare the results with those coming from a production baseline. In theory, Kayenta could compare the canary with the actual production systems, but that would give a statistically skewed result because the production system has been running for some time. Creating a brand-new baseline cluster ensures that the metrics it produces are free of any effects caused by long-running processes.

Spinnaker runs a canary cluster and a new baseline cluster in parallel, in addition to the production system that serves customers. Each cluster typically has three instances, though that number is not fixed. Requests from a small share (~1%) of actual customers are then directed to these clusters, and a series of performance and functionality metrics is collected and logged in a time-series database. The metrics are later compared automatically to see how the canary stands against the baseline. This step is called judgement and concludes with an overall score in the range 0-100. It can be executed multiple times, not just once.
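As a rough illustration of how such a comparison could yield a score between 0 and 100, the sketch below marks a metric as passing when the canary's mean stays within a tolerance of the baseline's mean, then reports the percentage of passing metrics. This is not Kayenta's actual judge; the function names, the 10% tolerance, and the sample data are all assumptions made for the example.

```python
from statistics import mean
from typing import Dict, List


def metric_passes(baseline: List[float], canary: List[float],
                  tolerance: float = 0.10) -> bool:
    """Return True if the canary mean is within `tolerance` of the baseline mean."""
    base_mean = mean(baseline)
    canary_mean = mean(canary)
    if base_mean == 0:
        # Avoid dividing by zero: only an identical (zero) mean passes.
        return canary_mean == 0
    return abs(canary_mean - base_mean) / abs(base_mean) <= tolerance


def judgement_score(baseline: Dict[str, List[float]],
                    canary: Dict[str, List[float]],
                    tolerance: float = 0.10) -> float:
    """Score from 0 to 100: the percentage of shared metrics that pass."""
    names = sorted(baseline.keys() & canary.keys())
    if not names:
        raise ValueError("no metrics in common between baseline and canary")
    passed = sum(metric_passes(baseline[n], canary[n], tolerance) for n in names)
    return 100.0 * passed / len(names)


# Hypothetical samples: the canary's error rate is clearly worse.
baseline_metrics = {
    "latency_ms": [100, 102, 98],
    "error_rate": [0.010, 0.012, 0.011],
    "cpu_pct": [40, 42, 41],
}
canary_metrics = {
    "latency_ms": [101, 103, 99],
    "error_rate": [0.030, 0.028, 0.031],
    "cpu_pct": [41, 43, 42],
}

score = judgement_score(baseline_metrics, canary_metrics)
print(round(score, 1))  # 66.7
```

Two of the three metrics stay within tolerance, so the example scores roughly 67. A production judge would typically use a proper statistical test rather than a simple comparison of means.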

Figure: Netflix canary release process, showing Kayenta traffic routing (image from Netflix Tech Blog)


The judgement score falls into one of three categories: success, in which the canary is promoted to deployment; marginal, which may call for human intervention to decide what to do with the release; and failure, in which the whole pipeline is stopped and rolled back and incoming traffic is directed to the production system.
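The three categories can be pictured as two thresholds on the 0-100 score. The sketch below is a minimal illustration; the function name `classify_score` and the threshold values of 50 and 75 are assumptions chosen for the example, not values taken from Kayenta or Netflix.

```python
def classify_score(score: float, marginal: float = 50.0,
                   success: float = 75.0) -> str:
    """Map a 0-100 judgement score to 'success', 'marginal', or 'failure'."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be between 0 and 100")
    if marginal > success:
        raise ValueError("marginal threshold must not exceed success threshold")
    if score >= success:
        return "success"   # promote the canary
    if score >= marginal:
        return "marginal"  # may need a human decision
    return "failure"       # stop the pipeline and roll back


for s in (90, 60, 10):
    print(s, classify_score(s))
```

Keeping the thresholds as parameters lets each team decide how strict the gate should be for its own service.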

Kayenta is integrated with several monitoring tools: Stackdriver, Prometheus, Datadog and Netflix Atlas. Others can be added because the entire system is designed to be pluggable, including metric sources, judgement systems, and result storage.
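To give a feel for what a pluggable metric source means in practice, here is a small Python sketch of the general pattern: a common interface, interchangeable implementations, and a registry keyed by name. It does not mirror Kayenta's real interfaces; `MetricSource`, `InMemorySource`, `register_source` and `fetch_pair` are hypothetical names invented for this example.

```python
from typing import Dict, List, Protocol, Tuple


class MetricSource(Protocol):
    """Anything that can return samples for a metric on a given cluster."""

    def fetch(self, metric: str, cluster: str) -> List[float]:
        ...


class InMemorySource:
    """Toy source backed by a dict; a real one would query a monitoring system."""

    def __init__(self, data: Dict[Tuple[str, str], List[float]]) -> None:
        self._data = data

    def fetch(self, metric: str, cluster: str) -> List[float]:
        # Return a copy so callers cannot mutate the stored samples.
        return list(self._data[(cluster, metric)])


# Registry of available sources, keyed by a name chosen by the operator.
SOURCES: Dict[str, MetricSource] = {}


def register_source(name: str, source: MetricSource) -> None:
    SOURCES[name] = source


def fetch_pair(source_name: str, metric: str) -> Tuple[List[float], List[float]]:
    """Fetch baseline and canary samples for one metric from a named source."""
    source = SOURCES[source_name]
    return source.fetch(metric, "baseline"), source.fetch(metric, "canary")


register_source("memory", InMemorySource({
    ("baseline", "latency_ms"): [100.0, 102.0],
    ("canary", "latency_ms"): [101.0, 105.0],
}))

baseline_samples, canary_samples = fetch_pair("memory", "latency_ms")
print(baseline_samples, canary_samples)
```

Because callers depend only on the `fetch` method, swapping one monitoring backend for another would not require changes to the comparison logic.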

Because it is integrated with Spinnaker, Kayenta can be used to analyze and deploy canaries on supported platforms such as AWS, GCP, Azure, OpenStack, Kubernetes or hybrid environments.

Netflix is in the process of moving its entire canary deployment system to Kayenta, a migration the company says will be completed over the next few months. Kayenta is currently running some 200 judgements a day, representing about 30% of Netflix's complete load. Netflix added that "Kayenta has increased developer productivity by providing engineers with a high degree of trust in their deployments."
