container-diff - an Open Source Tool from Google for Analyzing Differences between Docker Images

Google has released container-diff, an open source tool for analyzing differences between Docker images. It reports filesystem differences and is aware of changes made by the apt, npm and pip package managers.

Dockerfiles are used to create and extend container images. Changing a Dockerfile and rebuilding produces a new image. Because Dockerfiles are plain text, differences between versions are easy to see, usually with the source control system’s diff tool. It is much harder to visualize or list the exact changes that a new Dockerfile command caused inside the image. This becomes a challenge when the packaged application depends on specific versions of other software, and those in turn pull in further dependencies, making it hard to track what actually gets installed. Untracked dependencies can also bloat the image, which slows downloads.
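To illustrate the gap, here is a minimal Python sketch that diffs two versions of a Dockerfile as plain text using the standard library's difflib. The Dockerfile contents and the `dockerfile_diff` helper are illustrative assumptions, not taken from Google's announcement.

```python
import difflib
from typing import List

# Two hypothetical Dockerfile versions; the second installs one extra package.
OLD_DOCKERFILE = """\
FROM python:3.6
RUN pip install flask
"""

NEW_DOCKERFILE = """\
FROM python:3.6
RUN pip install flask requests
"""


def dockerfile_diff(old: str, new: str) -> List[str]:
    """Return a unified diff of two Dockerfile texts, one entry per line."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="Dockerfile.old",
            tofile="Dockerfile.new",
            lineterm="",
        )
    )


diff_lines = dockerfile_diff(OLD_DOCKERFILE, NEW_DOCKERFILE)
print("\n".join(diff_lines))
```

The text diff shows that a single RUN line changed, but says nothing about which additional packages pip resolves and installs as a result. That information exists only in the built image.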

The container-diff tool computes "semantic" differences, meaning that it presents the diff in a form the user can understand and act on. It derives these by combining the actual low-level differences with knowledge of the package manager. container-diff supports Python’s pip package manager, the apt tool for Debian and derivative Linux distributions, and the node package manager (npm) for node.js packages. It can also analyze differences between file versions in the filesystem. One, some or all of these analyses can be run at once.
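The idea of a package-level diff can be sketched in a few lines of Python. This is not container-diff's implementation; it assumes the installed packages of each image have already been extracted into name-to-version dictionaries, and the `package_diff` helper and the sample versions are hypothetical.

```python
from typing import Dict, Tuple


def package_diff(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, dict]:
    """Compare two {package: version} maps and group the differences."""
    added = {name: ver for name, ver in new.items() if name not in old}
    removed = {name: ver for name, ver in old.items() if name not in new}
    changed: Dict[str, Tuple[str, str]] = {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical package sets for two image builds.
image_a = {"flask": "0.12.2", "jinja2": "2.9.6", "six": "1.10.0"}
image_b = {"flask": "0.12.2", "jinja2": "2.10", "requests": "2.18.4"}

result = package_diff(image_a, image_b)
print(result)
```

Grouping the result into added, removed and changed packages is what makes the diff actionable: a reader sees which versions moved rather than which bytes differ.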

The images to be analyzed can be specified as an image in the local Docker daemon, an image in a remote registry, or a filesystem path. The last option is useful when an image has been exported using the docker save command. The Docker daemon need not be running to run container-diff. The tool can also output a single image’s history.
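As a rough illustration of why an exported image can be inspected without a running daemon, the following Python sketch lists the layer archives recorded in a docker save tarball. It assumes the archive contains a top-level manifest.json holding a JSON array whose entries have a "Layers" list, which is the layout docker save has commonly produced. The `saved_image_layers` helper is hypothetical and is not part of container-diff.

```python
import json
import tarfile
from typing import List


def saved_image_layers(archive_path: str) -> List[str]:
    """List the layer paths recorded in a `docker save` tarball's manifest.json.

    Assumes manifest.json is a JSON array of image entries, each with a
    "Layers" list of paths inside the archive.
    """
    with tarfile.open(archive_path) as archive:
        # extractfile raises KeyError if manifest.json is missing.
        member = archive.extractfile("manifest.json")
        if member is None:
            raise ValueError("manifest.json is not a regular file")
        manifest = json.load(member)
    return [layer for image in manifest for layer in image["Layers"]]
```

For example, after running `docker save -o image.tar myimage` (where myimage is a placeholder image name), calling `saved_image_layers("image.tar")` would return the layer paths in the order the manifest lists them.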

Other tools that have attempted something similar include Anchore's diff tool and Project Atomic's 'atomic diff' command. Docker’s own 'docker history' command displays only the individual Dockerfile changes, which can be seen by just looking at the Dockerfile. Some reverse engineering can reveal more low-level details, but these are hard to translate into events such as which packages were installed. Project Atomic’s tool can show differences between filesystems and is RPM-aware, i.e., it can show differences in terms of which RPM packages were installed. Additionally, atomic diff can compare two containers, a container and an image, or two images.

According to the article by Google, container-diff can be made part of the development workflow by providing automatic changelog management and by integrating with continuous integration systems, especially since it can produce output in JSON format. container-diff supports authentication via the docker-credential-helpers package when the image resides in a registry, whether self-hosted or managed, such as the Google Container Registry. This package uses native programs for storing Docker credentials, e.g. osxkeychain on OS X.
