Continuous Deployment with Containers

The open source release of Docker in March 2013 triggered a major shift in the way the software development industry aspires to package and deploy modern applications. Many competing, complementary, and supporting container technologies have followed in the wake of Docker, and this has led to much hype, and some disillusionment, around this space. This article series aims to cut through some of this confusion and explain how containers are actually being used within the enterprise.

This article series begins with a look into the core technology behind containers and how it is currently being used by developers, and then examines core challenges with deploying containers in the enterprise, such as integrating containerisation into continuous integration and continuous delivery pipelines, and enhancing monitoring to support a changing workload and potential transience. The series concludes with a look to the future of containerisation, and discusses the role unikernels are currently playing within leading-edge organisations.

This InfoQ article is part of the series "Containers in the Real World - Stepping Off the Hype Curve". You can subscribe to receive notifications via RSS.


Many of us have already experimented with Docker, for example by running one of the pre-built images from Docker Hub. Through that experimentation, your team might have recognized the benefits that Docker provides in building microservices and the advantages the technology could bring to development, testing, integration, and, ultimately, production.

However, you must create a comprehensive build pipeline before deploying any containers into a live environment.

Integrating containers into a continuous-delivery pipeline is far from easy. Along with the benefits Docker brings, there are challenges both technological and process related. This article attempts to outline the steps you need to take for a fully automated continuous-deployment pipeline that builds microservices deployed into Docker containers.

The continuous-deployment pipeline

The continuous-deployment pipeline is a set of steps executed on each commit to the code repository. The objective of the pipeline is to perform a set of tasks that will deploy a fully tested and functional service or application to production. The last human action is to make a commit to a repository; everything else is done automatically. Such a process increases reliability by (partly) eliminating human error and increases throughput by letting machines do what they do best (running repeatable processes that do not require creative thinking). The reason every commit should pass through the pipeline lies in the word “continuous”. If you postpone the process and, for example, run it at the end of a sprint, neither testing nor deployment would be continuous.

By postponing testing and deployment to production, you postpone the discovery of potential problems and as a result increase the effort required to correct them. Fixing a problem a month after the problem was introduced is more expensive than fixing that problem after only a week. Similarly, if only a few minutes elapse between committing code and notification of a bug, the time required to locate the issue is almost negligible. However, continuous deployment is not only about savings in maintenance and bug fixing. It allows you to get new features into production much faster. The less time that passes between the development of a feature and its availability to users, the faster you start to benefit from it.

Instead of listing the build steps that should be included within the pipeline, let’s start with the absolute minimum and drive towards one possible solution. The minimum set of steps would allow you to test the service, build it, and deploy it. All those tasks are required. Without testing, we have no guarantee that the service works. Without building it, there is nothing to deploy. Without deploying it, our users can’t benefit from the new release.
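As a rough illustration of that minimum, the three steps can be modeled as an ordered list of stages in which a failure stops everything that follows. This is only a sketch: the `run_pipeline` function and the placeholder stage actions are hypothetical stand-ins, not part of any CI tool.

```python
from typing import Callable, List, Tuple

# A stage is a (name, action) pair; the action returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, action in stages:
        if not action():
            break  # a failed stage aborts everything after it
        completed.append(name)
    return completed


# Hypothetical placeholders for the real work of each stage.
minimal_pipeline = [
    ("test", lambda: True),
    ("build", lambda: True),
    ("deploy", lambda: True),
]
```

If the test stage fails, nothing is built or deployed, which is the property the rest of the pipeline builds on.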


You traditionally test software by running unit tests against the source code. While that gives you high code coverage, it does not necessarily prove that the features are working as expected or that the individual units of code (methods, functions, classes) are behaving as designed. To validate features, you add functional tests, which tend to orient towards black-box testing not directly related to the code. One issue of functional tests is system dependencies. Java applications might require a specific JDK. Web applications might need to be tested against a myriad of browsers. It is not uncommon to repeat the same set of tests with many different system combinations.

The sad truth is that many organizations do not have tests that are extensive enough to ensure that a release can be deployed to production without human intervention. Even when tests are truly reliable, they often do not run under all the circumstances you expect to see in production. The reason behind that lies in the way we manage infrastructure. When operated manually, infrastructure is expensive to set up. How many servers would you need to configure all of the browsers you expect your users to use? Ten? A hundred? What happens when one project has different run-time dependencies than another?

Most organizations use multiple, different environments. One runs on Ubuntu while another has Red Hat. One uses JDK8 while another hosts JDK7. That is an expensive approach, especially if those environments are static (as opposed to a "create and destroy" approach through cloud hosting). Even if you set up enough servers to fulfill all necessary combinations, you still face issues of speed and flexibility. For example, if a team decides to develop a new service or refactor an existing one to a different stack, considerable time might pass between the request for a new environment and the point at which it is fully operational. During this period, continuous deployment is on hold.

When you add microservices to this mix, the problem multiplies. While developers had only a few applications to worry about in the past, now you must deal with tens, hundreds, or even thousands of services. After all, the benefits of microservices include the flexibility to choose the best technology for a given use case and speedy releases. You do not want to wait until a whole system is developed; you want to release functionality limited to a single microservice as soon as it's done. A single bottleneck is enough to reduce that speed to a crawl and, in many cases, that bottleneck is infrastructure.

You can easily handle many of the testing problems by using Docker containers. Everything a service or application needs for testing can and should be placed inside a container. Take a look at the Dockerfile.test. It is a container used for testing a microservice that uses Scala for the back end, Polymer for the front end, and MongoDB as a data store. It tests an entirely self-sufficient service vertically split from the rest of the system. I won't go into details of that Dockerfile definition, but I will list what it has inside. It contains Git, NodeJS, Gulp, and Bower for the front end. Scala, SBT, and MongoDB are needed for the back end. Some of the tests use Chrome and Firefox. The service source code and all dependencies are there as well. I'm not suggesting that you should choose this stack for your services, but I am instead trying to point out that in many cases tests require quite a few run-time and system dependencies.

Requesting a server with all that would mean a long wait until everything is set up, even in the best of cases. More likely than not, you would start experiencing conflicts and problems after a dozen similar requests for other services. Servers are not made to host an infinite number of potentially conflicting dependencies. You could use VMs dedicated to testing a single service, but that would mean a lot of wasted resources and slow initializations. With Docker containers, you move this job away from infrastructure teams and place it in the hands of developers. They choose what their application needs during the testing phase, specify that in a Dockerfile, and let their continuous-deployment tool of choice build and run the container that will execute all the tests they need. Once the code passes all tests, you can proceed to the next stage and build the service itself. The tests container should be pushed to a Docker registry (private or public) so that you can reuse it on later occasions. Apart from the benefits already mentioned, when a test run ends, you can remove the container, leaving the host server in the same state it was in before. That way, you can reuse the same server (or cluster of servers) for testing all the services you are developing.

The diagram just got a bit more complex.


Now that you’ve run all tests, you can build the container that you ultimately will deploy to production. Since you likely will deploy it to a different server than the one on which you’re building it, you should also push it to a Docker registry.

Once you’ve tested and built the new release, you are ready to deploy it to the production server. All you have to do is pull the images and run the container.
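A minimal sketch of those two halves, assuming a single image that is built on the CI server and run on the production server. The function name, image name, container name, and port are all hypothetical; only the Docker CLI subcommands and flags are real.

```python
from typing import List, Tuple

Commands = List[List[str]]


def release_commands(image: str, name: str, port: int) -> Tuple[Commands, Commands]:
    """Return (build-side, production-side) docker CLI calls for one release."""
    build_side = [
        # Build the production image and publish it to the registry.
        ["docker", "build", "-t", image, "."],
        ["docker", "push", image],
    ]
    production_side = [
        # Fetch the image on the destination server and start it detached.
        ["docker", "pull", image],
        ["docker", "run", "-d", "--name", name, "-p", f"{port}:{port}", image],
    ]
    return build_side, production_side
```

The registry is the only thing the two sides share, which is why the push is not optional when the build server and the production server differ.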


With containers available from the registry, you can start deploying your microservices after each commit and deliver new features to users faster than ever. The business is happy and showers you with awards and you go home knowing you did something great and useful.

But the process defined thus far is far from a complete continuous-deployment pipeline. It is missing quite a few steps, things to consider, and paths to take. Let's identify and tackle the problems one by one.

Safely deploying through the blue-green process

Probably the most dangerous step in the pipeline is deployment. If we pull a new release and run it, Docker Compose will replace the old one with the new one. As a result, there will be some downtime during the process. Docker needs to stop the old release, start the new one, and your service needs to initialize. Whether this process lasts a few minutes, a few seconds, or even microseconds, there is still some downtime. If you adopt microservices and continuous deployment, releases will come more often than before. You eventually will be deploying multiple times a day. No matter how often you release, interruption is always something to avoid.

The solution lies in blue-green deployment. If you are new to this topic, please read my “Blue-Green Deployment” article. In a nutshell, the process deploys a new release in parallel with the old one. One is called “blue”, and the other is called “green”. Since both are running in parallel, there is no downtime (at least not due to the deployment process). Running both releases in parallel opens some new possibilities but also creates a few new challenges.

The first thing to consider when practicing blue-green deployment is how to redirect user traffic from the old release to the new one. Before, you simply replaced one release with the other, so both would run on the same server and the same port. With blue and green versions running in parallel, each needs its own port. Chances are that you are already using some proxy service (NGINX, HAProxy, etc.). One new challenge is that the proxy cannot be static anymore. Its configuration should continuously change with each new release. If you are deploying to a cluster, things get even more complicated. Not only ports change, so do IP addresses. To use a cluster effectively, you should deploy services to the most suitable server at that moment. The criteria for deciding which server is most fit should be based on available memory, type of hard disk, CPU, and so on. That way, you can best distribute services and significantly optimize the usage of available resources. That poses new problems. The most pressing one is how to find the IP addresses and ports of services you are deploying. The answer lies in service discovery.
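To make the idea of a non-static proxy concrete, here is a small sketch that renders an NGINX `upstream` block from the address and port of the release that should receive traffic. The function name and the sample values are hypothetical; tools such as confd or Consul Template do this kind of templating for real.

```python
def render_upstream(service: str, address: str, port: int) -> str:
    """Render an NGINX upstream block pointing at one running release."""
    return (
        f"upstream {service} {{\n"
        f"    server {address}:{port};\n"
        f"}}\n"
    )


# Example: the new release was started on 10.0.0.5 with a dynamic port.
config = render_upstream("my-service", "10.0.0.5", 32768)
```

After writing the rendered text to the proxy's configuration directory, the proxy has to reload its configuration (for NGINX, `nginx -s reload`) for the switch to take effect.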

Service discovery consists of three parts. You need a service registry where you store service information. Then, you need a process that will register new services and de-register those that you stop. Finally, you need a way to retrieve service information. For example, when you deploy a new release, the registration process should store the IP address and port in the service registry. The proxy can later discover that information and use it to reconfigure itself. Some of the commonly used service registries are etcd, Consul, and ZooKeeper. You can use Registrator for registering and de-registering services and confd and Consul Template for service discovery and templating. For more about service discovery and those tools, please read my “Service Discovery: Zookeeper vs etcd vs Consul” article.
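The three parts can be illustrated with an in-memory stand-in for a real registry. This sketch is not the API of etcd, Consul, or ZooKeeper; the class and method names are hypothetical and only show which operations the deployment flow relies on.

```python
from typing import Dict, Optional, Tuple


class ServiceRegistry:
    """In-memory illustration of a service registry (not a real client)."""

    def __init__(self) -> None:
        self._services: Dict[str, Tuple[str, int]] = {}

    def register(self, name: str, address: str, port: int) -> None:
        """Store where a newly started service can be reached."""
        self._services[name] = (address, port)

    def deregister(self, name: str) -> None:
        """Forget a service that has been stopped."""
        self._services.pop(name, None)

    def lookup(self, name: str) -> Optional[Tuple[str, int]]:
        """Return (address, port) for a service, or None if unknown."""
        return self._services.get(name)
```

In a real setup, registration and de-registration would be triggered by container start and stop events, and the proxy templating step would be the main consumer of `lookup`.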

Now that you have a mechanism for storing and retrieving service information and you can use it to reconfigure the proxy, the only question left unanswered (for now) is which color to deploy. When deploying manually, you know that the previous color was, for example, green and that the next one should be blue. When everything is automated, you need to store that information where it is available for the deployment flow. Since you’ve already established service discovery as part of the process, you can register the color together with service IP address and port for retrieval when needed.
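Choosing the color then becomes a tiny function of whatever value was registered last. The sketch below assumes the stored value is either `"blue"`, `"green"`, or missing (first deployment); the function name is hypothetical.

```python
from typing import Optional


def next_color(current: Optional[str]) -> str:
    """Return the color to deploy, given the currently registered one."""
    if current == "blue":
        return "green"
    # Either the current release is green or nothing has been deployed yet.
    return "blue"
```

The deployment flow would read the current color from the service registry, deploy `next_color(current)`, and write the new color back only after the release has been verified.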

Taking all that into account, the pipeline is as displayed in the following diagram. Since the number of steps is increasing, I have split them into pre-deployment, deployment, and post-deployment groups.

Running pre-integration and post-integration tests

You might have noticed that the first step in the deployment pipeline is to run tests. While the running of tests is paramount, and also provides confidence that the code is (most likely) working as expected, it does not verify that the service to be deployed to production is truly working as expected. Many things could have gone wrong. Maybe you did not set up the database correctly or perhaps firewalls are preventing access to the service. The list of things that might impede a service from working properly in production is far from short. Thinking that code is working as expected is not the same as verifying that what you deployed is correctly configured. Even setting up a staging server, deploying your service inside it, and running another round of tests cannot make you fully confident that the same results will always be found in production. To distinguish different types of tests, I'll call those that I defined earlier “pre-deployment” tests. I'm intentionally avoiding a more accurate name because the kind of tests you'll run at that early stage changes from one project to another. They can be unit tests, functional tests, and so on. No matter the type, what they all have in common is that you run them before you build and deploy the service.

The blue-green process presents a new opportunity. Since both the old and the new releases of the service run in parallel, you can test the latter before you reconfigure the proxy to point to it. That way, you can safely deploy the new release to production and test it while the proxy continues to redirect your users to the older release. I tend to call tests at this phase “pre-integration” tests. The name might not be the best one to use since many developers are familiar with a different meaning of “integration tests”, but in this particular case, it means tests that you run before the integration of the new release with the proxy service (before the proxy is reconfigured). Those tests let you skip staging environments (which are never fully the same as production) and test the new release with exactly the same settings that users will use when the proxy is reconfigured. Well, the word “exactly” is not entirely accurate: the difference is that you'll test the service without the proxy, whereas users should not be allowed to access it in any way other than through the proxy. As with pre-deployment tests, the result of pre-integration tests can be a signal to proceed with the flow or to abort the process in case of failure.

Finally, after we reconfigure the proxy, we should execute another round of testing — this time, called “post-integration” tests. Those should be fast since the only thing left to verify is whether the proxy is indeed configured correctly. That usually means that a few tests that make requests on ports 80 (HTTP) and 443 (HTTPS) are enough.
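A post-integration check along those lines might look like the following sketch. The host and path are hypothetical, and the `fetch` parameter exists so the check can be exercised without a network; by default it performs real requests with the standard library.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict


def http_status(url: str) -> int:
    """Return the HTTP status code for url (performs a real request)."""
    try:
        with urllib.request.urlopen(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx and 5xx responses are raised as exceptions by urllib.
        return error.code


def post_integration_check(
    host: str, path: str, fetch: Callable[[str], int] = http_status
) -> Dict[str, bool]:
    """Request the service through the proxy over HTTP (80) and HTTPS (443)."""
    urls = [f"http://{host}{path}", f"https://{host}{path}"]
    return {url: fetch(url) == 200 for url in urls}
```

If any value in the returned dictionary is `False`, the proxy did not pick up the new release correctly and the flow should roll back.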

Once you have adopted Docker, you should run all these tests as containers, in the same way I recommend you run pre-deployment testing. The benefits are the same and in many cases the same testing container can be used for all testing types. I tend to pass an environment variable that indicates which types of tests should be run.
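One possible shape for that switch is sketched below. The variable name `TEST_TYPE` and the suite names are assumptions made for illustration; the article does not specify them.

```python
import os
from typing import Dict, List, Mapping

# Hypothetical mapping from test type to the suites the container runs.
TEST_SUITES: Dict[str, List[str]] = {
    "pre-deployment": ["unit", "functional"],
    "pre-integration": ["service-without-proxy"],
    "post-integration": ["proxy-http", "proxy-https"],
}


def suites_to_run(env: Mapping[str, str] = os.environ) -> List[str]:
    """Pick the suites to execute based on the TEST_TYPE variable."""
    test_type = env.get("TEST_TYPE", "pre-deployment")
    if test_type not in TEST_SUITES:
        raise ValueError(f"unknown TEST_TYPE: {test_type}")
    return TEST_SUITES[test_type]
```

The same image can then be started with, for example, `docker run --rm -e TEST_TYPE=pre-integration` followed by the image name, so one testing container serves all three phases.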

Rolling back and cleaning up

Before we revisit the successful and failing outcomes of the testing steps, let’s define the desired state of the production environment. The logic is simple. If any part of the flow fails, the environment should maintain the same state, as if the process were not even initiated. Apart from triggering some form of notification of the problem and creating a culture where fixing the culprit is of utmost importance and priority, there is not much more to do. The problem is that rollback is often not as easy as it sounds. Fortunately, Docker containers make that task easier than any other approach since there are very few side effects on the environment itself.

Deciding what to do with results of pre-deployment tests is easy. You deployed nothing so you can either continue the flow or stop it. On the other hand, the steps you executed before reaching the result of pre-integration tests did leave the production environment in an undesirable state. You deployed the new release (blue or green) and if it is faulty you should remove it. Follow that removal with the elimination of any service data generated by that release. Because the proxy is still pointing to the old release, users will continue using it oblivious of your attempt to deploy a new feature. The last set of tests adds additional obstacles. As the proxy has been changed to point to the new release, you need to revert to the original condition. The same holds true for the registration of the deployed color.

While I’m focusing only on failures caused by tests, that does not mean that other steps of the flow cannot fail. They can, and similar logic should be applied to resolve the problem. No matter what fails, the environment needs to revert to the state it was before.
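One way to express "revert to the previous state no matter what fails" is to pair every step with an undo action and unwind the completed steps in reverse order. This is a sketch of that general pattern, with hypothetical names; it is not taken from any particular deployment tool.

```python
from typing import Callable, List, Tuple

# Each step is (name, do, undo). A step signals failure by raising.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_rollback(steps: List[Step]) -> bool:
    """Run steps in order; on failure, undo the completed ones in reverse."""
    undo_stack: List[Callable[[], None]] = []
    try:
        for _name, do, undo in steps:
            do()
            undo_stack.append(undo)
    except Exception:
        for undo in reversed(undo_stack):
            undo()
        return False
    return True
```

With steps such as "deploy new release" (undo: remove it and its data), "reconfigure proxy" (undo: restore the old configuration), and "register color" (undo: restore the old color), a failing test step after any of them leaves production as it was before the flow started.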

Even if everything goes as planned, there is still some cleaning up to do. You need to stop and de-register the old release.

I did not mention databases even though they probably pose the greatest challenge during the rollback phase. The scope of the subject is too wide for this article, so I'll just state what I consider the primary rule: always make sure that schema changes that accompany the new release are backward compatible and make sure that there is an extensive set of tests that confirms it. You must run them during pre-deployment testing or risk reaching the point of no return. I often receive complaints that backward compatibility is not feasible. While that is true in some cases, more often than not, that opinion comes from the waterfall era when teams were releasing products once a month or even once a year. If the pipeline cycle is short and you execute it on every commit (assuming that we do not commit less than once a day), the changes that should be applied to databases are generally small enough that backward compatibility can be accomplished with relative ease.
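As a very rough illustration of what such a test could check, the sketch below flags two common ways a schema change breaks the previous release: removing a field the old code still uses, and adding a field the old code does not know it must fill in. The representation of a schema as a mapping from field name to a "required" flag is an assumption made for this example, and real compatibility checks need to cover far more (types, constraints, data migrations).

```python
from typing import Dict, List


def backward_compatibility_problems(old: Dict[str, bool], new: Dict[str, bool]) -> List[str]:
    """Compare two schemas that map a field name to True when it is required.

    Returns a list of reasons the old release could break on the new schema.
    """
    problems = []
    for field in old:
        if field not in new:
            problems.append(f"field removed: {field}")
    for field, required in new.items():
        if field not in old and required:
            problems.append(f"new required field: {field}")
    return problems
```

An empty list means the change is purely additive and optional, which is the kind of small step that frequent releases make possible.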

Deciding where to run each step

Deciding where to execute each step is of utmost importance. As a general rule, you want to do as little as possible inside production servers. That means that all but deployment-related tasks should run in a separate cluster dedicated to continuous deployment. I painted such tasks in yellow and those that do need to run in production in blue in the following diagram. Please note that even blue tasks should not run directly in production but through APIs of your tools. For example, if you deploy containers with Docker Swarm, you do not need to enter the server in which the master resides; instead, create the DOCKER_HOST variable that will instruct the local Docker client of the final destination.
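A minimal sketch of that approach: build an environment in which `DOCKER_HOST` points at the remote endpoint and hand it to whatever process invokes the local Docker client. The function name and the address in the usage note are hypothetical.

```python
import os
from typing import Dict, Mapping


def remote_docker_env(docker_host: str, base: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return a copy of the environment with DOCKER_HOST set to docker_host."""
    env = dict(base)
    # The local docker client sends its commands to this endpoint.
    env["DOCKER_HOST"] = docker_host
    return env
```

For example, `subprocess.run(["docker", "ps"], env=remote_docker_env("tcp://swarm-master.example.com:2375"), check=True)` would list containers on the remote cluster from the continuous-deployment server, without anyone logging in to a production machine.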

Finishing the continuous-deployment flow

Being able to reliably deploy each commit to production is only half the work. The other half is related to monitoring deployments and acting depending on real-time and historical data. Since the end goal is automation of everything after code commits, assume that human interaction will be reduced to an absolute minimum. Creating a self-healing system is a challenge that requires continuous tuning. You want your system to be able to recuperate from failures (reactive healing) as well as to prevent those failures from happening in the first place (preventive healing).

If for some reason a service process stops, the system should instantiate it again. If the reason for the failure is, for example, a node that has become unreliable, that instantiation should happen on a different (healthy) server. The major pieces for reactive healing are tools that collect data, continuously monitor services, and run actions in cases of failure. Preventive healing is much more complicated and requires a database with historical data that can be evaluated against patterns to allow for predictions that something undesirable will happen in the future. Preventive healing might discover that traffic has been steadily increasing and in a few hours will require scaling. Or it might realize that every Monday morning, traffic spikes and the system requires scaling and, later on, de-scaling when traffic returns to normal. If you are interested in more details, please read my articles on “Self-Healing Systems” and “Centralized Logging and Monitoring”.
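Both kinds of healing can be illustrated with two small functions. The first picks a node on which to restart a stopped service; the second fits a straight line to recent hourly traffic and estimates how long it will take to reach a scaling threshold. The names, the data shapes, and the choice of a simple linear trend are assumptions made for illustration; real systems would use proper monitoring data and more robust forecasting.

```python
from typing import Dict, List, Optional


def reschedule_target(current_node: str, node_health: Dict[str, bool]) -> Optional[str]:
    """Reactive healing: choose where to restart a stopped service."""
    if node_health.get(current_node, False):
        return current_node  # the node is fine, restart in place
    for node, healthy in node_health.items():
        if healthy:
            return node  # move to the first healthy node
    return None


def hours_until_threshold(hourly_requests: List[float], threshold: float) -> Optional[float]:
    """Preventive healing: estimate hours until traffic reaches threshold.

    Uses a least-squares linear trend over the samples. Returns 0.0 when the
    latest sample already meets the threshold, and None when there are fewer
    than two samples or traffic is not growing.
    """
    n = len(hourly_requests)
    if n < 2:
        return None
    latest = hourly_requests[-1]
    if latest >= threshold:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(hourly_requests) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(hourly_requests))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    slope = covariance / variance
    if slope <= 0:
        return None
    return (threshold - latest) / slope
```

A scheduler could call `hours_until_threshold` periodically and scale the service ahead of time when the estimate drops below the time it takes to bring up new instances.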


Which tools to use as part of the continuous deployment flow depends on individual preference and the specifics of a project. Instead of making a definitive recommendation, I'll discuss my personal preferences.

Docker is the obvious choice for any architecture based on microservices. I would even go as far as saying that without containers (Docker or any other type), microservices produce more problems than solutions. You can find more information in the “Microservices: The Essential Practices” article. As a proxy, both NGINX and HAProxy work great. Each has its downsides but overall you can hardly go wrong with either.

Anything but a minuscule cluster needs an orchestrator. I prefer Docker Swarm as it currently provides more freedom than other solutions. On the other hand, it comes with fewer tools packaged inside the distribution. You need to build things by yourself (I'd say that's the cost of freedom). Kubernetes is more mature and comes with more functionality out of the box. Mesos was not initially designed to work with Docker, but gained Docker support as an afterthought. For a more detailed comparison, please read my “Docker Clustering Tools Compared: Kubernetes vs Docker Swarm” article.

Finally, my preferred CI/CD server is Jenkins. Bear in mind that implementing the flow as described in this article is painful and costly to maintain when chaining together freestyle jobs. Instead, the preferable way to go is through the Pipeline and the CloudBees Docker Pipeline plugins. For more information about Jenkins Pipeline, please read my “The Need For Jenkins Pipeline” and “Jenkins Pipeline” articles.

The DevOps 2.0 Toolkit

If you liked this article, you might be interested in my book, The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices. Among many other subjects, it explores microservices, containers, and continuous deployment in much more detail.

The book is about different techniques that help us design software in a better and more efficient way with microservices packed as immutable containers, tested and deployed continuously to servers that are automatically provisioned with configuration management tools. It's about fast, reliable, continuous deployments with zero downtime and the ability to roll back. It's about scaling to any number of servers, the design of self-healing systems capable of recuperation from both hardware and software failures, and centralized logging and monitoring of the cluster.

In other words, this book covers the whole microservices development and deployment lifecycle using some of the latest and greatest practices and tools. It looks at Docker, Kubernetes, Ansible, Ubuntu, Docker Swarm and Docker Compose, Consul, etcd, Registrator, confd, Jenkins, and so on. I go through many practices and even more tools.

The book is available from Leanpub and Amazon (and other worldwide sites).

About the Author

Viktor Farcic is a Senior Consultant at CloudBees. He coded using a plethora of languages starting with Pascal (yes, he is old), Basic (before it got Visual prefix), ASP (before it got .Net suffix), C, C++, Perl, Python, ASP.Net, Visual Basic, C#, JavaScript, etc. He never worked with Fortran. His current favourites are Scala and JavaScript even though most of his office hours are spent with Java. His big passions are Microservices, Continuous Integration, Delivery and Deployment (CI/CD) and Test-Driven Development (TDD).

