"The Docker Book" Review and Author Q&A

"The Docker Book", by James Turnbull, is a hands-on book for everyone who wants to learn about Docker. It will take you from your first installation, through simple examples that explain Docker's concepts, to more complex scenarios that shed some light on how you would use Docker in the real world.

The book starts by describing what Docker is and what it is for. It then introduces the major concepts: images, containers and registries. Images are the basic building blocks, which James describes as "the 'source code' for your containers": highly portable and easy to update. Containers are the "runtime" component: a self-contained execution environment, based on an image. Images and containers are what make Docker stand apart. Registries are the mechanism through which images are managed, stored and shared. They play more or less the same role for Docker images that GitHub and friends play for source code.

After learning the basic concepts, you start from the very beginning, that is, Docker's installation. You are taught how to install Docker on several Linux distributions, Windows and OS X. James lists several user interfaces for managing Docker, but their overall immaturity makes it clear that in Docker-land the command line is king. You'll be proficient with most Docker command-line sub-commands by the time you finish the book.

Once you have Docker installed on your system, James teaches you to interact with it through the command line. Among other basic scenarios, he shows you how to create containers (docker run), how to inspect them (docker inspect) and how to list the ones that are running (docker ps).
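
As a rough illustration of the sub-commands named above, the following Python sketch assembles the three command lines instead of running them, so it works without a Docker daemon. The container name `demo` and the `ubuntu` image are illustrative assumptions, not examples taken from the book.

```python
import shlex

def docker_cmd(subcommand, *args):
    """Return the argv list for a docker sub-command (nothing is executed)."""
    return ["docker", subcommand, *args]

# Hypothetical container name and image, chosen only for illustration.
NAME = "demo"
IMAGE = "ubuntu"

commands = [
    # Create and start an interactive container from an image.
    docker_cmd("run", "--name", NAME, "-i", "-t", IMAGE, "/bin/bash"),
    # Dump the container's low-level configuration as JSON.
    docker_cmd("inspect", NAME),
    # List the containers that are currently running.
    docker_cmd("ps"),
]

for argv in commands:
    print(shlex.join(argv))
```

On a host with Docker installed, passing any of these lists to `subprocess.run` would execute the corresponding command.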

The next step is to learn how to create images and store them in repositories. Again, Docker repositories can be seen as the counterpart of Git repositories: Docker repositories store Docker images; Git repositories store code. There are two ways to create images: using docker commit or Dockerfiles. Since images use a layered format, built upon union file systems, Docker uses the metaphor of a commit to create new image layers. An image layer is much like a new version in a source control tool. Indeed, as you have surely understood by now, Docker leans heavily on the source control metaphor.

Although you can use docker commit, which creates a new image layer with the changes you made to a container, the recommended way to create new images is through Dockerfiles. James spends quite some time explaining this core Docker concept. You'll learn both the Dockerfile format in detail and its runtime execution flow. For instance, given that each Dockerfile instruction creates a new image layer, Docker gives you a time machine to go back in an image's history. Also, if an instruction fails, you can run the image produced by the last successful instruction and debug from there. This is one of the very powerful constructs Docker provides, and it radically changes how servers are built.
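
To make the instruction-per-layer idea concrete, here is a minimal Python sketch that walks a small Dockerfile and numbers its instructions, following the description above that each instruction yields a layer. The Dockerfile contents (base image, package) are assumptions made for illustration, and the parser deliberately ignores line continuations.

```python
# A small Dockerfile held as a string; base image and package are assumptions.
DOCKERFILE = """\
# Hypothetical example image
FROM ubuntu:14.04
RUN apt-get update
RUN apt-get install -y nginx
EXPOSE 80
"""

def instructions(dockerfile_text):
    """Return (INSTRUCTION, arguments) pairs, skipping blanks and comments."""
    result = []
    for line in dockerfile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        keyword, _, rest = line.partition(" ")
        result.append((keyword.upper(), rest))
    return result

steps = instructions(DOCKERFILE)
for number, (keyword, rest) in enumerate(steps, start=1):
    # One numbered entry per instruction, mirroring one layer per instruction.
    print(f"layer {number}: {keyword} {rest}")
```

If the third instruction failed during a build, the image left by the second one would be the natural place to start a shell and investigate.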

Once you have your Dockerfiles ready and your images built, it's time to push them to Docker registries. James explains how to use Docker's own DockerHub and how it integrates with GitHub or BitBucket to provide automated image builds. Automated Builds, previously named Trusted Builds, is a workflow that automatically builds and stores an image on DockerHub when you push code to a GitHub or BitBucket repository. In the book's initial edition, James also mentioned alternative registries such as Quay or Orchard, and explained how to build your own registry when your context requires it. Underlining the shifting landscape, after the book was published CoreOS bought Quay and Docker bought Orchard. Interestingly, James has been updating the book to keep up with these events.

James Turnbull uses more elaborate scenarios to convey how Docker might be applied in the real world. He first uses a continuous integration scenario, using Jenkins to run multi-configuration build jobs. Then he uses Docker to build and serve a Jekyll web site. He finishes with a web application built on Node.js and a fully replicated Redis backend. The flow is similar in each scenario, with the most interesting bit being the fact that Docker is used for everything. The host machine executes Docker and nothing more. Volumes and Docker's networking capabilities are presented through these scenarios.

When you remove a container, all changes made inside it are discarded unless they were committed, thus creating a new image layer. This is not ideal for application data. Volumes are Docker's answer to this problem, as they enable persistence by bypassing the Union File System. James uses the above scenarios to show how volumes are key to any real-world Docker usage.
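
A volume is typically requested with the `-v host_dir:container_dir` option of docker run. The sketch below only builds such a command line; the paths and the `nginx` image are hypothetical and not taken from the book.

```python
def run_with_volume(image, host_dir, container_dir):
    """Build a `docker run` argv that mounts host_dir at container_dir."""
    # -d runs the container in the background; -v declares the volume mapping.
    return ["docker", "run", "-d", "-v", f"{host_dir}:{container_dir}", image]

# Hypothetical paths and image: site content lives on the host, so it
# outlives any individual container that serves it.
argv = run_with_volume("nginx", "/srv/site", "/usr/share/nginx/html")
print(" ".join(argv))
```

Because the data sits in a host directory, a replacement container started with the same mapping would see the same files.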

When it comes to networking, James explains various options. The obvious one, exposing a container's ports on the host's local network, is introduced early in the book. Docker's internal network feature is discussed later. Upon installation, Docker creates an interface on the host, called docker0, which connects the host with all its containers. We can create a virtual subnet this way, although James warns about two big drawbacks related to the direct use of IPs: they would have to be hard-coded in the application's configuration and, if that were not enough, a container's IP changes on restart.

The last option that the book discusses on networking is linking. A link creates a parent-child relationship between two containers, allowing the parent to use a port to talk to the child that no other container may use. This port is not exposed to the host, providing a high degree of security. Sadly, but as might be inferred from this description, linking only works between containers that share the same host.
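
The two options, publishing a port on the host and linking containers, can be contrasted in a short Python sketch that builds the command lines without executing them. The container names, the `myapp` image and the port numbers are assumptions made for illustration.

```python
def run_detached(image, name, publish=None, link=None):
    """Build a `docker run -d` argv with optional port publishing and link."""
    argv = ["docker", "run", "-d", "--name", name]
    if publish:
        host_port, container_port = publish
        # -p makes the container port reachable through the host.
        argv += ["-p", f"{host_port}:{container_port}"]
    if link:
        child_name, alias = link
        # --link lets this (parent) container reach the child by alias.
        argv += ["--link", f"{child_name}:{alias}"]
    return argv + [image]

# Child: a Redis container with no port published on the host.
redis_argv = run_detached("redis", "redis")
# Parent: a hypothetical web app, published on the host and linked to Redis.
web_argv = run_detached("myapp", "web", publish=(3000, 3000), link=("redis", "db"))

for argv in (redis_argv, web_argv):
    print(" ".join(argv))
```

Only the web application's port is visible from the host here; the Redis container is reachable solely through the link.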

Given the limited scope of each container, the need to orchestrate all the containers required to provide a complex service quickly becomes apparent. James devotes a chapter to Fig, an orchestration tool for multi-container services, and Consul, a highly resilient service discovery tool, showing how they might be used in more complex use cases.

Docker's APIs get their own chapter in the book, with an emphasis on the Remote API. The Remote API is a REST interface that provides capabilities similar to the command line, and it is the natural way to automate Docker management and configuration. These APIs are not at the core of the book, though, so they only get a cursory introduction. That being said, this introduction is enough in the book's context. The APIs provide a different channel to communicate with Docker, but they use the same concepts and provide the same capabilities discussed throughout the rest of the book.
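
To give a flavour of how the REST interface mirrors the command line, the following Python sketch builds the URLs for two listing endpoints. It assumes a daemon that has been configured to listen on TCP at `localhost:2375`, which is not the default (the daemon normally listens on a local Unix socket); the helper name `endpoint` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the daemon was started listening on TCP at this address.
BASE_URL = "http://localhost:2375"

def endpoint(path, **params):
    """Return the full URL for a Remote API path with optional query params."""
    query = "?" + urlencode(params) if params else ""
    return f"{BASE_URL}{path}{query}"

# Roughly the REST counterpart of `docker ps -a`.
list_all_containers = endpoint("/containers/json", all=1)
# Roughly the REST counterpart of `docker images`.
list_images = endpoint("/images/json")

print(list_all_containers)
print(list_images)
```

With a daemon reachable at that address, fetching either URL (for example with `urllib.request.urlopen`) would return a JSON document describing the containers or images.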

The book expects the reader to have basic familiarity with Linux, its command-line shells, packages (yum or apt), service management and basic networking. Everyone with either a developer or operations background should feel right at home. Even if you have a Windows background, it shouldn't be too difficult to follow the examples. The listings are quite complete and, when appropriate, the book points to the relevant GitHub repositories that contain the examples' code.

The scenarios presented in the book are powerful demonstrations of why Docker is all the rage these days. Given the book's main aim, though, the scenarios are simplified versions of the real world. We will still have to wait for an account of the ups and downs of Docker in real production usage.

James Turnbull is a prolific author with a handful of books to his credit. InfoQ reviewed one such book, "The LogStash Book", last year. "The Docker Book" is a valuable addition to a Docker newcomer's library. You can get much the same information from Docker's official documentation, but the book is better organized and easier to read. On the other hand, if you are already comfortable with the technology, spend your library's budget elsewhere.

InfoQ took this opportunity to hear James' thoughts on this moving landscape.

InfoQ: When reading about Docker for the first time, it seems that server configuration management is now trivial. You just have to create a Dockerfile and you're good to go. Some people caution against that "illusion", without denying important benefits. What's your take on this?

James Turnbull: Using Docker versus using other tools isn't a binary decision. You don't throw away your toolkit when you adopt a new tool and no technology is a panacea. People who suggest either are usually trying to sell you something or trying to use fear, uncertainty or doubt to stop you using something else. Good developers and sysadmins use the best tools for the job. In the case of Docker and configuration management they are highly complementary. For Docker-based applications it's awesome to use configuration management to build Docker images and maintain Docker hosts. Docker is another, albeit pretty powerful!, tool in your server management and application toolkit.

InfoQ: One thing is certain: Docker and container technology are changing the landscape. What are the major paradigm shifts that Docker enables and that wouldn't be feasible without it?

James Turnbull: Docker is about making applications easier to build and portably deploy. It's not so much that anything specific isn't possible without it but rather that Docker makes hard IT tasks, like shipping code fast, having portable workloads, and building applications, easier and simpler.

InfoQ: "The Docker Book" is a hands-on book, for someone who wants to learn Docker from first principles. But, after learning Docker's mechanics, one wants to learn about the bigger picture, how to adapt one's infrastructure to this new world. What resources should your readers consume after they finish reading the book?

James Turnbull: I hope the book is more than just an introduction. I take you through the basics but I also cover using Docker in the 'real world' as part of your testing and continuous integration workflow and to build and deploy applications and services. I also cover the basics of more 'datacenter' tooling like Fig for orchestration and Consul for service discovery.

Beyond the book there is the Docker documentation and the excellent support resources in the Docker community: the #docker IRC channel on Freenode and the Docker-user mailing list.

InfoQ: Docker is not the only container technology out there, and the concepts aren't even new. Why do you think Docker was able to gather this momentum and enthusiastic following?

James Turnbull: Docker makes containers easier to use. Docker owes a lot to prior art; without the awesome work in software like Solaris Zones or LXC containers it wouldn't exist. But for many sysadmins and developers those tools weren't readily approachable or easy to use. Docker is built to be simple and to allow a novice to quickly learn the basics and create containers for their applications.

InfoQ: What does the future look like for Docker?

James Turnbull: Pretty bright I think! Docker is growing an amazing ecosystem of tools and contributors. It's being embraced by a lot of people across the industry and we're starting to see features and integrations that allow you to easily build and manage complex applications. I think you'll see a solid percentage of existing workloads, not all!, but a considerable share of applications shift to be run in Docker containers.

About the Book Author

James Turnbull is the author of seven technical books about open source software and a long-time member of the open source community. James has authored books about Docker, Logstash and Puppet. He works for Kickstarter as VP of Engineering and is an advisor at Docker Inc. James speaks regularly at conferences including Velocity, OSCON, FOSDEM, DevOpsDays and a number of others. He is a past president of Linux Australia, a former committee member of Linux Victoria, was Treasurer for 2008, and serves on the program committee and OSCON.