Lessons From A DevOps Journey
Matt Callanan has been pushing the boundaries of Agile software development for over six years and most recently, when challenged to deliver an upgrade to a large commercial system to an Australian financial institution, he extended that journey to DevOps. He recently shared his experiences in a talk at the Agile Development Practices West conference in Las Vegas entitled "Lessons From A DevOps Journey". InfoQ caught up with Matt prior to the conference to find out more about his experiences in DevOps.
InfoQ: What got you passionate about Agile development?
Going back to 2006, I was working on a large Agile project as a developer. I had never experienced Agile before then. I was thrown into a team of highly disciplined XP programmers and for the first couple of weeks it was a bit of a mind melt! Being involved in that team of really smart guys and being immersed in their culture, I found I was learning a lot about TDD, refactoring, pair programming and fast feedback, as well as interacting much more closely with testers and business analysts and a whole raft of other things that I had never been exposed to before. As a result, I became quite enthusiastic, as I could see that the maintainability of our code and the deployability and sustainability of our system were so much better than where I had been before. That made me quite passionate and pretty much turned my career around at that time.
InfoQ: How did your Agile journey evolve to DevOps?
Back then I was exposed to the highly collaborative culture of testers and developers working closely together using Agile practices. Fast forward a couple of years, I started work on another Agile project that this time revolved around the installation of a vendor product. We had to work closely in that team with some operations personnel who hadn't been exposed to a lot of the development practices and Agile skills that we brought from previous experience. Working very closely with those people at that point in time helped me realise all the hard work that they do and the discipline that is required in their particular skillset. Being paired up in a highly collaborative team with operations personnel and working with System Administrators and DBAs made me realise that Agile is not just about development and testing but it goes further than that. It's about collaborating with all different aspects of the project lifecycle. If you can extend the idea of collaboration further and start collaborating with operations personnel, then not only do they gain the benefit of Agile development practices like reducing feedback loops and increasing testability, but developers also gain the benefit of the operations perspective, such as the challenges of keeping production systems running reliably and the discipline that’s required in your code to support that. It's a win-win for both sides.
InfoQ: Through your experience, what is your definition of DevOps?
Patrick Debois said in a recent interview that there is not one definition for DevOps. It can mean different things to different people. For me, it boils down to collaboration and automation. Ultimately it's about creating a collaborative culture in your teams so that your developers aren't just thinking about how to throw a distributable package across to the operations guys and hope for the best. It's about working closely with them and getting to the point where you can streamline the path to production in a way that helps them out and helps you out. It's all about the collaboration, and often automation is the tool that is used to coordinate those activities so that you can keep your systems in a place where you have fast feedback and repeatability of the installation across different environments.
InfoQ: In your talk at the Agile Development Practices West conference you are encouraging people to stop procrastinating and just start by using simple tools. Can you tell us about some of the tools you have used and why you chose them?
We had a home brew approach to automation in the area of configuration management that typically today you would do via Puppet or Chef. We were doing a lot of Windows automation, which was not supported by Puppet and Chef at the time, so what we did was make the best out of what we had. Rather than saying "we don't have any tools that will help us out in this area so let's just give up on automation", we used a combination of things like Batch scripting for Windows, Bash scripting for Unix and Groovy scripting for much of our Windows automation too, as we were a Java shop with a lot of Groovy skills in the team. As a result, we were able to combine the best of both worlds, as Groovy was a powerful language that we could use halfway between Batch and Java. It provides an easy way to invoke command-line programs with much more power than a typical batch script (if you have ever tried to write a for loop in a batch script, then you'll know what I am talking about!). And if you write all your installation automation code in Java, then you’ll need to compile it, JAR it and wrap it in a batch script with classpath settings, all of which slows you down when you’re testing your automation out on the destination environment. With Groovy we could run our scripts directly from the command line. We essentially had a home brew approach to automation that we just built up with the tools that we had at hand. I would encourage others to start their journey with what they have rather than spending six months in meetings discussing potential tools, because there are so many benefits to be gained from the lessons you learn during early automation attempts.
InfoQ: What about continuous integration and deployment?
The enterprise we were working with had a tool called Tableaux that was available. Essentially it is like Capistrano for Ruby for doing your deployments to various boxes. It has a web GUI that provides the deployment mechanism to run various scripts and activities on remote machines. We built up a series of release kits (as they are called in Tableaux) and bundled up different tasks that we could run together that set up the dependencies, installed the product and ran tests on remote servers. Through trial and error we built up a suite of these tasks that we could run together to produce a continuous integration build that essentially exercised the entirety of our operations code. We had a lot of process around testing our Java code for custom interfaces to other systems. We took the Agile approach of testing our code often and applied that to our operations scripts and installation procedures. We implemented a nightly build and found ways to automate the installation of everything, which gave us fast feedback on any of the activities that might cause us problems. We had dozens of changes being introduced every day to our automation and any of those individual changes could potentially cause unexpected problems when played together with other scripts. If you don't run your automation as soon and as often as you can, you won't find out about the problems until weeks or months down the track, and at that point it is too late. When you are at that critical point where you need to deploy straight away, it's not the time to be finding problems. We borrowed a lot of the "fail fast, fail often, feedback is key" principles from Agile development and applied that to our operations code.
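To make the idea of chaining tasks and failing fast concrete, here is a minimal sketch. It is not code from Matt's project and has nothing to do with how Tableaux works internally; it is written in Python rather than the Groovy the team used, and the task names, the `run_tasks` function and the `EXAMPLE_TASKS` list are all hypothetical.

```python
import subprocess
import sys


def run_tasks(tasks):
    """Run (name, command) pairs in order and stop at the first failure.

    Returns (completed_names, failure), where failure is None on success
    or a (name, exit_code, output) tuple for the task that failed.
    """
    completed = []
    for name, command in tasks:
        # Each task is an ordinary command-line program; capture its output
        # so a failing nightly run can report what went wrong.
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode != 0:
            return completed, (name, result.returncode, result.stdout + result.stderr)
        completed.append(name)
    return completed, None


# Hypothetical tasks. Each one just runs the Python interpreter so the sketch
# works on any machine; real tasks would call installers, scripts or tests.
EXAMPLE_TASKS = [
    ("set-up-dependencies", [sys.executable, "-c", "print('dependencies ready')"]),
    ("install-product", [sys.executable, "-c", "print('product installed')"]),
    ("run-smoke-tests", [sys.executable, "-c", "import sys; sys.exit(3)"]),
    ("publish-report", [sys.executable, "-c", "print('never reached')"]),
]

if __name__ == "__main__":
    completed, failure = run_tasks(EXAMPLE_TASKS)
    print("completed:", completed)
    if failure is not None:
        name, code, _output = failure
        print(f"failed: {name} (exit code {code})")
```

Stopping at the first non-zero exit code means that a scheduled run surfaces a broken step on the day it is introduced, and later steps never run against a half-installed environment.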
InfoQ: You appear to have used version control extensively. What were some of the approaches and benefits?
Version control is one of the most important concepts that DevOps brings to operations and having the ability to see the history of changes, being able to track them and see how code has changed as well as being able to do merging, branching and various other development activities to your operations code is a real bonus. A lot of operations teams are not used to working in that kind of fashion, so that was an important practice that we brought across with us. We stored the vendor product in Subversion and that gave us the benefit that whenever we received a patch, update or change to the software from the vendor we would commit it to version control and over time could see an exact history of changes that had come from the vendor and at what point in time we had integrated with those. The typical vendor install approach would be to install it manually on the box and apply the update changes one over the top of each other so you don't necessarily see what has happened over time. Version control gives you a sense of traceability and accountability of the changes that have been applied to the environment. It also acted as a file server that our scripts could pull the distributable package down from, and as we installed customisations and various other configurations alongside, we could also see how they were affected by patches and updates.
InfoQ: Did the audit trail help you resolve conflicts over time?
It was great to be able to go back in history and see exactly what had changed. Often with a vendor relationship there can be some uncertainty about proving whether you have installed the patch, whether you installed it the right way and whether you applied the right version number. Being sure that when you say that you have applied patch number 9 and that it is patch number 9 running in the system can be useful in those cases.
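The team relied on Subversion history to answer the question of which patch is actually installed. As a rough illustration of the same kind of check using a different and simpler technique, the sketch below compares a checksum manifest of a reference copy of the vendor files against the installed copy. It is written in Python, is not taken from the project, and the function names `build_manifest` and `diff_manifests` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def build_manifest(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            manifest[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def diff_manifests(expected, actual):
    """Report files that are missing, unexpected or changed versus expected."""
    shared = expected.keys() & actual.keys()
    return {
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
        "changed": sorted(name for name in shared if expected[name] != actual[name]),
    }


if __name__ == "__main__":
    # Build two throwaway directories standing in for the reference copy of a
    # patch and the files found on a server (both invented for this sketch).
    with tempfile.TemporaryDirectory() as reference, tempfile.TemporaryDirectory() as installed:
        (Path(reference) / "app.cfg").write_text("version=9\n")
        (Path(installed) / "app.cfg").write_text("version=8\n")
        print(diff_manifests(build_manifest(reference), build_manifest(installed)))
```

An empty result in all three lists would indicate that the installed files match the reference copy byte for byte; anything else points to the exact files to investigate.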
InfoQ: How did you evolve the use of pair programming to improve the collaboration with operations staff?
We took the approach that pair programming was not just an activity for code development but also an activity for anything that would be part of the upgrade project. The operations people on our team brought their skills from their day-to-day jobs of supporting the production environment, and the developers brought Agile development skills and the discipline of highly tested code. We took the approach that any task could be paired on, so developers could, for example, pair up with someone in operations to set up an F5 configuration, to install IIS, to have conversations with a DBA or to configure automated message queues. Similarly for typical development tasks, we would pair up operations with developers so they could understand some of the code being deployed and some of the practices that went into the development, including the long-term sustainability, maintainability and testing practices. We took the approach that we were a highly collaborative team and that different individuals have different skills, but anyone was able to work on anything, and in that way we were able to skill up a lot of people in areas they would normally not get exposure to.
InfoQ: Ultimately you say it is about delivering a product to the customer and not about delivering automation. What were some of the benefits or outcomes by taking this approach that your customers saw?
It's not about automation. Automation is something that we relied on quite heavily, but one of the key tenets of DevOps is about delivering business value sooner, and as a result we saw a number of benefits. On the final go-live deployment night, it was agreed that it was one of the most successful deployments that the operations team had ever seen, due to a lot of the discipline that had gone into the project. Previously, operations staff on the old system were getting called out for after-hours support almost every night to fix problems or hand-hold jobs that should have run automatically, but after go-live the number of callouts reduced significantly. Patches were now generally quicker to apply. The old environment had been installed manually, resulting in environments that were out of sync and people who were too scared to touch Production. With Subversion history and a build running every night testing the installation, we had confidence in changes before going to Production. Ultimately we took a deployment process that took up to five days down to one day, and as a result gave the business a smoother ride due to confidence in deployment.
InfoQ: You home brewed a lot of tools for this particular project, but what are some of the exciting tools you see coming through the DevOps community now?
If we had our time over again, we realise that we spent a lot of our time in automating the configuration management side of things, but now there are tools like Puppet and Chef that do a lot of that heavy lifting for you. I would definitely be looking at evaluating those and incorporating them as part of our procedure, as it allows us to abstract ourselves away from a lot of the nitty gritty details. I think enterprises are starting to see the value of these tools, as there are a lot of manual procedures around installation that can go awry quite frequently. There are a lot of reusable templates coming out nowadays for Linux and even Windows, where people in the community have already done the hard yards around the installation of various tools and technologies and are contributing their templates back to the community. You can download them from Puppet Forge or Chef's Opscode Cookbooks, then reuse and adapt them for your environment.
InfoQ: What is your final advice for people in Agile teams who have not thought about making the journey or to those who have been considering DevOps?
For us, DevOps was about pushing Agile further and taking the principle of individuals and interactions over processes and tools straight from the manifesto and applying it not just to development tasks but to all other areas within the project. You want to get the product out as early as possible and test your assumptions, so you need to apply the Agile principles beyond development and testing to the last mile. Fail early, fail often and collaborate, because it is about people more than tools, and find ways to take the feedback cycle out to the boundaries. Also, I recommend the DevOps Cafe podcast and the DevOps Weekly email newsletter as a great way to immerse yourself in the DevOps movement and keep up to date.
In his presentation, Matt outlined the top seven lessons from his journey:
- Get the skills mix right
- Co-locate where possible
- Automation provides leverage
- Fail early, fail often
- Don't need latest tools
- Keep automation options open
- Know when to automate
For more information on DevOps, Matt maintains a list of resources on his website.
About the Interviewee
Matt Callanan is a freelance senior software developer with more than twelve years of experience in finance, telecommunications, and security industries. Matt has a passion for code quality, maintainability and testability, and a fascination for team productivity. Matt saw the Agile light in 2006 while working in the finance and security industries on development projects with the XP dials turned up to eleven. Since then, he has worked—both as developer and agile mentor/ScrumMaster—with various software development teams helping them find more efficient ways of working together and shaping codebases in the direction of testability, maintainability, and supportability.