Docker Containers LiveLessons Review and Q&A
Addison-Wesley Professional’s Docker Containers LiveLessons aims to show what Docker containers are and how they can be used effectively. The 2.5+ hour video course is aimed at practitioners with no previous experience and leads them on a journey from installing the software to more complex topics such as orchestration. InfoQ has spoken with course instructor Chris Negus.
Over the past few years, Docker containers have become a popular solution for process isolation. A container includes a complete filesystem that hosts everything required to run a given app, such as its code, runtime, system tools, and libraries. Being isolated from its host environment, a container provides a restricted and customized view of the hosting OS, which makes it independent of the host’s Linux kernel version, platform distribution, or deployment model. This means containers can be easily shared and moved around using the open-source, locally deployable Docker Registry or the cloud-based Docker Hub.
Docker containers are a lightweight solution that provides benefits similar to those of virtual machines, but with lower overhead, since containers running on the same OS share its kernel, while virtual machines usually include their own full OS. Docker containers, though, are not equivalent to virtual machines for all use cases; e.g., they do not support live migration.
Chris Negus’s LiveLessons show you how to work with Docker containers on a few Linux distributions, starting with the basics: how to install Docker and how to pull and push container images from and to Docker registries. As a natural step forward, Negus then demonstrates how to run, stop, and restart containers, look inside them, and save and load container images, which are the foundations of container operation. Finally, the course turns to the topic of creating custom containers, also describing a few fundamental rules for dealing with networking, logging, and storage, and it introduces the topic of orchestrating multiple containers using Kubernetes and GearD.
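The basic workflow covered in the early lessons can be sketched with a few Docker CLI commands (the image and container names here are illustrative, and a running Docker daemon is assumed):

```shell
# Pull an image from a registry (Docker Hub by default)
docker pull nginx

# Run a container from the image in the background, giving it a name
docker run -d --name mywebserver -p 8080:80 nginx

# Look inside the running container
docker exec -it mywebserver bash

# Stop and restart the container
docker stop mywebserver
docker start mywebserver

# Save an image to a tarball, and load it back on another system
docker save -o nginx.tar nginx
docker load -i nginx.tar
```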
InfoQ has spoken to instructor Chris Negus to learn more about his course and his opinions about the present and future of Docker.
InfoQ: Could you please shortly introduce yourself and describe your experience with Docker containers?
Chris Negus: I started using Docker about two years ago at Red Hat, when I was assigned to help write about how to use Docker with enterprise-quality Red Hat operating systems. After doing many of the first articles from Red Hat on using Docker, Kubernetes, and Project Atomic, I now spend most of my time helping Red Hat develop best practices for building and deploying Linux containers in the Enterprise. I also work with the Red Hat Container Development Kit, which helps developers use OpenShift and Red Hat Enterprise Linux to develop containers on their Windows, Mac, or Linux laptops and desktops.
InfoQ: For readers that are new to this technology, could you briefly explain what Docker is and who should be concerned? What are containers?
Chris: Docker is an open source software project that defines a format and tools for packaging and running applications. Linux applications have traditionally come in packages (RPM, .deb, etc.) that, when installed, spread files across the filesystem of the host computer. Often, an application has dependencies that require that even more packages be installed on the host. The more software you add directly to a host’s filesystem to get a specific application running, the more risk you take that there will be conflicts with other applications running on that system.
With Docker, an application and its dependent software are stored and transported together in a “Docker image”. Instead of installing a Docker application, you simply run its Docker image (or, if the image is not present locally, Docker automatically pulls it to the local system and runs it). The application, its dependent software, and metadata that describes how to run and use the application can all be included in the image.
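A minimal Dockerfile sketch illustrates how an application, its dependencies, and the metadata describing how to run it are bundled into a single image (the base image, package, and file names are illustrative):

```dockerfile
# Start from a base image that provides the needed runtime environment
FROM fedora:latest

# Install the application's dependent software into the image itself,
# rather than onto the host's filesystem
RUN dnf install -y httpd

# Copy the application's content into the image
COPY index.html /var/www/html/

# Metadata describing how to run and use the application
EXPOSE 80
CMD ["httpd", "-DFOREGROUND"]
```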
In the container model, the host computer can remain fairly generic. Conflicts between running Docker images are minimized since each has its own filesystem, network interfaces, process table, and other features that can be kept apart from each other. Docker containers can be spun up quickly, run alongside applications that might otherwise conflict, and removed cleanly.
Containerization offers developers a promise of more complete control of their applications and the environments they run in, and new tools to develop them (such as OpenShift). System administrators now have new entities to manage (container images), differently configured (often slimmer) operating systems (such as Atomic Platform and Atomic Host) and new tools for deploying and managing containers (such as Kubernetes).
InfoQ: How is that different from a virtual machine?
Chris: A virtual machine contains an entire, bootable operating system. For a Linux system, that means starting a kernel and at least a handful of system services. A container, on the other hand, usually runs one process: the application it was made for. Containers often have other utilities inside, such as the bash shell, that you could execute to interact with a running container. But you would rarely have more than two or three processes running inside a container at a time.
The size of a typical virtual machine may be several gigabytes. Because a container only needs to hold what is required to run and manage an application, a typical container may only be a few hundred megabytes. A virtual machine, in general, consumes more processing power as well.
While some people talk about containers replacing virtual machines some day, virtual machines and containers can be used together. Keep in mind that a container still needs an operating system to run on. So a common practice is to launch a virtual machine, using something like a Red Hat Atomic Host, in a cloud environment and use it to run multiple containers.
InfoQ: Docker can be said to be changing the way applications are developed and delivered. What are the factors that are driving this change forward?
Chris: There’s an overall trend in the industry to look at datacenters as vast, expandable pools of compute, memory, and storage resources. If more resources are needed, an extra host can be plugged in and started up quickly because less individual configuration of each host should be required. What that means to developers is that they need to do more jobs that used to be done by system administrators. Likewise, administrators must also take on new roles.
Developers are expected to not only provide the software needed by an application, but must also provide information on how the different pieces are interconnected in a runtime environment. For example, a complete application might include multiple containers, providing many different interconnected services. Instead of an administrator setting up physical networks and laying out which machines provide which services, all of that can be configured virtually within the application that the development team delivers.
So, when it gets down to it, the developer doesn’t deliver an RPM package and a list of things that the administrator must do to get the application inside it set up to run. Instead, developers might hand off a set of containers and a set of JSON files that define the services and virtual networks the application needs. They need to define the order in which services are started, the exact executable that runs, and directions on which features (such as storage volumes) the container must access from the host.
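A hypothetical JSON service definition of the kind Negus describes, here in the style of a Kubernetes service, might look like this (the name, labels, and ports are illustrative):

```json
{
  "kind": "Service",
  "apiVersion": "v1",
  "metadata": { "name": "webapp" },
  "spec": {
    "selector": { "app": "webapp" },
    "ports": [ { "port": 80, "targetPort": 8080 } ]
  }
}
```

A definition like this routes traffic on port 80 to any container labeled `app: webapp`, replacing manual network and service layout by the administrator.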
InfoQ: Where does Docker fit in a CI/CD pipeline?
Chris: Docker fits well into the continuous integration/continuous delivery model because so much of the code that needs to be pushed through testing, staging, and production is contained within the Docker images themselves. In theory, this should make the environment a tester must set up on a laptop, or a system integrator needs to create in a staging area, less complicated.
Projects such as OpenShift have been designed to follow Docker image development through its entire lifecycle. By tightly integrating the OpenShift web UI and oc command (CLI interface) with orchestration tools such as Kubernetes, a developer can both build containers and check how they will function when they are ultimately deployed to a local datacenter or cloud environment.
One of the challenges of adapting your CI/CD lifecycle to containers is the simple fact that containers are different from RPM packages, ISO images, and other common ways of distributing software. So for new CI/CD tooling, ways are still being developed to verify what’s in a container, validate where the container came from, and determine best practices for monitoring, upgrading, and delivering multi-container applications.
InfoQ: Another use of Docker is doing hyper-scaling, what is your view about that?
Chris: Mesos provides a means of creating massive physical datacenters from which you can carve out virtual datacenters. Containers represent one type of workload that can be delivered across a datacenter managed by Mesos. If your company is not already using Mesos, and it doesn’t need to scale up to ten thousand or more computers, you might not want to deal with the complexity needed to set up a Mesos datacenter.
An alternative is to manage containers with Kubernetes. The Kubernetes project, which was created by Google and is the result of more than a decade of its container deployment experience, is on track to be able to scale up to over a thousand nodes in a single cluster. Over time, if you need to go beyond that, you can create multiple Kubernetes clusters (Google manages many Kubernetes clusters using software from its Borg project).
Starting with Kubernetes doesn’t prevent you from eventually moving your container workloads to Mesos, however. The Kubernetes on Mesos project is in the process of developing a Kubernetes framework to run on Mesos.
InfoQ: In which ways is Docker helping companies to move to hybrid clouds? Are there any other use cases where you see that Docker is growing?
Chris: Every major cloud provider and infrastructure-as-a-service project (such as OpenStack) that has centered on deploying virtual machines is now adding a container deployment strategy as well. So the Docker project doesn’t have to do much to help companies move to hybrid clouds. Potentially, any system that can run the docker service could be part of a company’s hybrid cloud strategy, provided there are orchestration tools to handle the deployment and management of containers on those systems.
InfoQ: In your LiveLessons, you start from the very basics, such as installing the software, creating a container, starting it etc., and then move on to more complex cases, such as orchestrating containers. What suggestions would you give to a beginner practitioner? What are the most important concepts, practices, etc. to be successful with Docker?
Chris: I’ve always said that the best way to really understand a new technology is to put your hands on it. That’s why I show how to set up a Linux system for containers, then demonstrate how to run and investigate containers. So, my first piece of advice is to try it.
As for concepts to understand, one that I touch on in the LiveLessons and in my book “Docker Containers” is the concept of namespaces. While a container is running live on your system, you can open a shell inside that container and view its process table, network interfaces, and filesystem. Seeing how the view from inside a container differs from the view from the host should give you some insight into how containers operate as separate entities from both the host and from other containers on the same host.
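One way to see namespaces in action, assuming a running container named `mycontainer` that has the usual utilities inside, is to compare the view from the host with the view from inside the container:

```shell
# From the host: the full process table, with many processes visible
ps -ef

# From inside the container: typically just the application process
# and the ps command itself
docker exec mycontainer ps -ef

# The container also sees its own network interfaces...
docker exec mycontainer ip addr show

# ...and its own root filesystem, distinct from the host's
docker exec mycontainer ls /
```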
Beyond namespaces, I think networking and storage can be among the most challenging topics to deal with in Docker as you begin creating more complex, multi-container applications and more automated deployments. You will soon see that you need tools beyond what the basic Docker service can offer. Tools such as Kubernetes can help connect together sets of services that span multiple containers. Setting up OpenShift as your container development platform can help lead you through your choices for configuring storage, networking, authentication and other features your applications need to run.
About the Interviewee
Christopher Negus is a certified RHCE instructor and principal technical writer for Red Hat, Inc. He is a Red Hat Certified Architect (RHCA), Red Hat Certified Instructor (RHCI) and Red Hat Certified Examiner (RHCX), and has certifications that include Red Hat Enterprise Virtualization (RHCVA), Red Hat Clustering and Storage management and Red Hat Enterprise Deployment and Systems Management. Christopher has authored dozens of books on Linux and open source software, including the Linux Bible, Red Hat Linux Bible, Linux Toolbox series, Linux Toys, and Live Linux CDs. At Red Hat, Chris is currently working on development projects that include technologies such as OpenStack, Red Hat Cloud Infrastructure, and Linux containers in Docker format. Earlier in his career, Chris worked at AT&T Bell Laboratories on the UNIX System V development team.