Automated Builds: The Key to Consistency
If there's one thing software developers are good at (other than quoting lines from the movie Hackers), it's automating things that used to be done manually. Making life easier for everyone is what we're here for: computers handle the tedious, repeatable tasks so people can focus on what matters to them. However, development teams often neglect the one audience that would benefit the most - themselves.
In many small and medium-sized software development shops, automated build and deployment tools don't exist. Building, staging, and deploying code are all done manually, as are running tests, backing up old versions, tagging new versions, and any number of other repetitive activities. You may think these are all pretty easy tasks - after all, your IDE can build your project with a single button or key combination, and you can publish websites by simply opening up two windows and dragging a few files or folders. But when you start adding up everything involved with maintaining a codebase and application, and consider all the various applications a team generally deals with, those few minutes here and there turn into hours of wasted time.
Fortunately, this problem is easy to solve. Basic automated build solutions are easy to set up, highly customizable, and cost next to nothing. This article describes some of the motivations behind setting up an automated process, and some concepts you'll need to get started. Part 2 of this series will describe specific implementations for .NET solutions, but the techniques can be used in any environment.
What are we trying to solve?
Before we dive too deep, let's take a look at some of the problems we're trying to solve. Not all of these will apply to your organization - but if you take a close look at your own team, you'll probably recognize some of them, along with other parts of your build process that could use some work.
Inconsistent builds: IDEs are wonderful for writing code. However, they come with a cost - things like framework/runtime versions, output architecture, debug/release versions, configuration settings, environment variables, and compilation options are often handled for you by the IDE or OS. This may seem like a good thing, but unless everyone has identical development machines or pays close attention to these details, you may get different output from different developers using the same codebase.
Incomplete builds: Most of us are guilty of bad source control practices at one time or another. Maybe we forget to commit a bug fix, or forget to pull down the latest code to get other people's changes before compiling. When this happens in a manual build environment, the code that gets deployed isn't the latest version. We need a solution that will force you to always publish what's in source control, and only what's in source control.
Failing unit tests: Unit tests are an integral part of any good application - when done right, they help avoid problems where fixing one bug produces another one. However, just writing the tests isn't enough. You have to regularly execute the whole batch, something that is easy to forget when you have to do it yourself. Since nobody is actually running them, you could have failing unit tests and never know it.
Human error: Even a simple and carefully planned process is prone to human error. All of us have had our mouse finger stutter and accidentally click the wrong button, or have accidentally deleted the OS kernel (happens more often than you'd think). Or maybe it's 1am and we're half-asleep and accidentally open up the production server instead of QA. This is simply unavoidable - we're not perfect, and therefore anything requiring human interaction has the potential for mistakes.
Security: Server and network security is always something to deal with as a software team. Usually there are two extremes - either server access is limited to the point where nobody has access to anything, with a month of red tape to set up a new employee - or your servers are fully opened up to everyone, and any member of your team could take down your system with one bad click. As a developer, I generally prefer the second one because I can actually get stuff done, but I see the danger of doing things this way. Regardless of where your organization falls in this range, automation can only improve your process.
What do we need to do first?
Once you've identified your team's main pain points, you can design a solution specific to your needs. There's no one-size-fits-all solution - as long as you build something that reduces manual processes and makes things more consistent, you're headed in the right direction.
There are a few things you really need, however.
You've got to start with good source control practices. Regardless of whether you use SVN, Mercurial, Git, or TFS (just please don't use SourceSafe), you need to define things like your branching strategy, how you handle third-party and internal libraries, and how to organize your own projects within your repositories. Of course, your team has to be on board; when someone goes rogue with a small project, they can screw up the whole process.
One developer should take on the role of buildmaster. This person will be responsible for writing the build scripts, setting up continuous integration, and probably will do the source control setup and deployments. Unless you have a very large and complex environment, this work should only take up a small percentage of this person's time, so they can still do regular development work for the majority of the week.
Even though you hope to never use it, an emergency plan should be built into your process, just like any other mission-critical application in your organization. There's a good chance your build server will be a single point of failure - it probably won't be load-balanced, nor will it have a hot-backup in case the server spontaneously explodes. In cases like this, you want to make sure you can quickly build a new server, complete with configuration and permissions, and also have a Plan B for building your applications without a build server. Even though the goal of this implementation is to never do anything manually again, it should always still be possible.
Automation of your build process relies on simple, repeatable tasks. Build scripts are the first step. A build script can be anything: a batch/shell file, an XML-based collection of tasks, a home-grown configurable application, or any combination thereof. In the .NET world, Microsoft provides MSBuild, the command-line tool that builds a Visual Studio solution from its XML-based project files. NAnt is another common .NET build script tool, similar to Ant, a popular Java tool. Others include Make, common in the open source world, and Rake, from the Ruby world.
No matter how you choose to write your build scripts, you should find something that works for you and stick with it. For example, once you've found the best way to build a web application project, setting up a script for a brand new web application should be as easy as copying the script from your other project and changing a few names and paths.
Implementation will obviously vary quite a bit between operating systems and programming frameworks, but the idea is generally the same. Your script should do whatever it is you normally do when you build/compile/stage your code. Typically, this is going to mean running a compiler against your code, using specific compilation options, and putting the output files somewhere separate from the original codebase, to prepare them for deployment. Even non-compiled projects, like static websites, may contain non-publishable content, like test pages or debug scripts, which you want to keep together in source control but not publish. In this case, your build script would stage only the files you do want to deploy, so you don't have to think about it every time.
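To make the staging step concrete, here's a minimal sketch in Python (any scripting language would do the job). It assumes your non-publishable files can be picked out by name patterns; the patterns and the function name stage_files are made up for illustration and don't come from any particular build tool.

```python
import shutil
from pathlib import Path

# Hypothetical patterns for files that live in source control but should not ship.
EXCLUDED_PATTERNS = ("test_*", "debug_*", "*.tmp")


def stage_files(source_dir: Path, staging_dir: Path, excluded=EXCLUDED_PATTERNS):
    """Copy publishable files from source_dir into a fresh staging_dir."""
    # Start from an empty staging area so stale files from an earlier build never ship.
    if staging_dir.exists():
        shutil.rmtree(staging_dir)
    # ignore_patterns skips any file or folder whose name matches one of the patterns.
    shutil.copytree(source_dir, staging_dir, ignore=shutil.ignore_patterns(*excluded))
    # Return the staged files as relative paths so the caller can log or verify them.
    return sorted(p.relative_to(staging_dir) for p in staging_dir.rglob("*") if p.is_file())
```

Calling stage_files(Path("site"), Path("build/staging")) would leave a clean copy of the site, minus the excluded files, ready for the deployment step. Wiping the staging folder first is a deliberate choice: it costs a little time, but the staged output always reflects exactly one build.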
Most software projects are going to contain more than one piece. For example, you may have a web application, but you also have a separate data library that's part of the same overall solution. For this, a single master script is the way to go. This script is the controller which calls each individual script one at a time. You'd have a script for each Visual Studio project, or Java package, or however your code is organized. Each of the individual scripts has tasks specific to just that one project, while the controller script contains any shared functionality.
Try to make your scripts as reusable and generic as possible. Keep your paths relative instead of absolute, define project-specific information in one place, and put anything reusable in your master script. This makes the scripts easier to maintain, and makes it easier to set up new projects later on.
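Here's one way the controller idea could look, sketched in Python. The script names in PROJECT_SCRIPTS are hypothetical, and the sketch assumes each project script is itself a Python file that exits with a non-zero code when its build fails. Note that the project list lives in one place and every path is relative to a base folder.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical per-project build scripts, listed in dependency order.
# Paths are relative to the base folder passed to run_all.
PROJECT_SCRIPTS = ["data_library/build.py", "web_app/build.py"]


def run_all(scripts, base_dir: Path) -> int:
    """Run each project script in order; stop at the first failure."""
    for script in scripts:
        print(f"Building {script} ...")
        # Run the script with the same Python interpreter that runs this controller.
        result = subprocess.run([sys.executable, str(base_dir / script)], cwd=base_dir)
        if result.returncode != 0:
            print(f"Build failed in {script} (exit code {result.returncode})")
            return result.returncode
    print("All projects built.")
    return 0
```

A controller file sitting at the root of your repository could call run_all(PROJECT_SCRIPTS, Path(__file__).parent) and pass the result to sys.exit, so whatever launches the build sees the failure too. Stopping at the first failure is one possible policy; you might prefer to build everything and report all failures at the end.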
Once your scripts are written, your projects can be compiled and staged with a single command. That's a great start, but a person still has to be there to call that command. Our goal is to remove human interaction, so continuous integration will take care of that for us.
As with build scripts, there are many different technologies to choose from, and many ways to organize your projects. But again, you'll want to find a solution that works for you and stick with it, to keep things consistent across your team.
Some popular choices are TeamCity, Jenkins, and CruiseControl (or CruiseControl.NET), or if you like multi-purpose applications, Microsoft Team Foundation Server can do continuous integration in addition to source control and handling the build. Each product has its own target audience, but they're all designed to watch your code and automatically run your build scripts, so you don't have to do it manually. The traditional strategy is for your CI server to watch your source control repository for changes, automatically pull them down whenever there's a change, then execute your scripts, which in turn builds your app and prepares it for release. However, as with build scripts, you can add custom tasks or behaviors to do whatever you need.
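The watch-and-build strategy is simple enough to sketch, even though a real CI server does far more (queues, history, dashboards, and so on). The Python below is only an illustration of the idea, not how any of the products above work internally. The names poll_once and watch are invented here, and get_revision and run_build are placeholders you would supply: one returns the latest revision identifier from your repository, the other pulls that revision and runs your build scripts.

```python
import time
from typing import Callable, Optional


def poll_once(get_revision: Callable[[], str],
              run_build: Callable[[str], bool],
              last_built: Optional[str]) -> Optional[str]:
    """Build if the repository has a revision we haven't built yet."""
    current = get_revision()
    if current == last_built:
        return last_built  # nothing new since the last build
    # Build the new revision. Even if the build fails we record it as handled,
    # so a broken commit is built once rather than on every poll.
    run_build(current)
    return current


def watch(get_revision, run_build, interval_seconds=60.0, max_polls=None):
    """Poll the repository forever, or max_polls times if a limit is given."""
    last_built = None
    polls = 0
    while max_polls is None or polls < max_polls:
        last_built = poll_once(get_revision, run_build, last_built)
        polls += 1
        time.sleep(interval_seconds)
    return last_built
```

The sketch polls on a timer, which is one possible design. Triggering a build from a commit hook in your source control system is another, if your tools support it.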
You can set up your unit tests, or any other automated tests you write, to execute as part of this process. Every time anyone checks in code, tests will be run immediately. For test failures, or even worse, compilation failures, you'll be notified something went wrong, so it can be quickly dealt with. Personally, I recommend shooting the responsible party with Nerf darts (a "dunce cap" is another popular choice), but the implementation is of course up to you.
Everything so far has led up to this moment. We're now at the point where we just commit code, and it compiles by itself, runs tests, and is staged and ready to release. The last step is to take the staged code, and push it wherever it needs to go.
Deploying a self-hosted web application is generally as straightforward as copying files from your staging area to your web server(s). This may mean manually copying the files, or putting together a simple batch/shell script. Since we're still trying to make life easy and automated, you'll want a better solution. Depending on your organization's security policies, you may be able to put together tasks inside your continuous integration process to copy the files over the file system or FTP. For a pure HTTP solution, you can look into products like DubDubDeploy, which will copy files server-to-server without the restrictions of domain security or file system access.
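For the simplest case, copying over the file system, a deployment task might look something like the Python sketch below. It assumes the web server's folder is reachable as an ordinary path (a network share, for example), and the function name deploy is invented for this example. It copies only files that are new or whose contents have changed.

```python
import filecmp
import shutil
from pathlib import Path


def deploy(staging_dir: Path, target_dir: Path):
    """Copy new or changed files from staging_dir to target_dir; return what was copied."""
    copied = []
    for source in sorted(staging_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(staging_dir)
        destination = target_dir / relative
        # shallow=False compares file contents rather than trusting size and timestamp.
        if destination.exists() and filecmp.cmp(source, destination, shallow=False):
            continue
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, destination)
        copied.append(relative)
    return copied
```

This sketch never deletes anything from the target, so files removed from your project would linger on the server. Whether to clean those up automatically is a policy decision worth making deliberately.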
If you've got a packaged product, your final step is simply to package up your product. Your build script would have already dealt with creating installation packages, organizing your documentation, and anything else related to your releasable files. Now that you have a deployable product, all that's left is to take it and put it on your 3.5" floppy disks and box them up for distribution.
Putting together an automated build environment may take a little time - at least a few days, probably a few weeks before you've got something working the way you want it, and you'll probably still tweak things as you go. In the end, it's a bargain when you consider the time you'll save every single day going forward, and the frustrations you'll avoid by having a consistent process that everyone can easily follow.
Your organization is most efficient when everyone does what they're best at. Your developers will be writing the code, your buildmaster will handle the build and deployment configuration, and your build server will do all the repetitive tasks that it does very well. A smooth-running software process generally leads directly to a better product and a faster release cycle, both of which will make a huge difference in your bottom line.
About the Author
Joe Enos is a software engineer and entrepreneur, with 10 years’ experience working in .NET software environments. His primary focus is automation and process improvement, both inside and outside the software world. He has spoken at software events across the United States about build automation for small software development teams, introducing the topics of build scripts and continuous integration.
His company’s first software product, DubDubDeploy, was recently released - the first in a series of products to help software teams improve how they manage their build and deployment process. His team is currently working on fully automating the .NET build cycle.