Jez Humble on Continuous Delivery


2. Jez really needs no introduction, but he is the author of Continuous Delivery and he is also a principal at Thoughtworks Studios, if I got that right. So what does a principal at Thoughtworks do?

It is kind of a made-up name. We have principals that are the second highest people at Thoughtworks, but what I do in particular is a few things: writing, speaking at conferences, speaking to our customers, doing some consulting on and off and helping out with sales.


3. It sounds easy but it is not. As developers we need to focus on continuous integration, and indeed InfoQ has covered that quite a bit, but I wanted to step back just a little bit and talk about the overall deployment pipeline. Can you give us an overview?

Yes, absolutely. One of the problems that we’ve experienced again and again working on large scale Agile projects at Thoughtworks is that development teams start doing Agile and you have these iterations, but not in terms of getting software delivered to production, or released to users in the case of products or embedded stuff. There is what is sometimes called "the last mile", the bit from dev complete to release, and that is often the most high risk and volatile part of a delivery process. And it goes back to the fact that you can be done with a bit of functionality (dev complete), but you are not really done-done, in the sense that you haven’t tested it in a production-like environment under realistic loads, and you can find all kinds of problems around your architecture and so forth when you do that.

And then also, at a slightly higher level, if it takes a long time to get stories released to users, it takes a long time to get feedback on whether what you are doing is actually valuable. One of the things that has been talked about a lot in Silicon Valley is the lean startup and this idea of creating a minimum viable product and iterating rapidly, so Continuous Delivery is essential when you want to get fast feedback on your business idea, on your hypothesis, in order to be able to actually iterate and produce something valuable.


4. So feedback is absolutely critical?

Yes. It’s all about optimizing for fast feedback and rich feedback and multiple feedback loops: from users to the business at the highest level, but also these other feedback loops from the unit tests to developers, from operations back to development and testing and so forth. So there are all these different feedback loops, and it’s about very rich and fast feedback.


5. Last night at one of the receptions I had the pleasure of meeting two guys from the Department of Defense at the Pentagon and we were talking actually about continuous deployment, not necessarily delivery but deployment. First of all there is the distinction, right?

Continuous deployment is when you release every good build to production, and that really is something that applies to web sites and software-as-a-service kinds of systems. Continuous Delivery is kind of a superset, in the sense that continuous delivery is about being able to release on demand and to be able to do push-button releases that are low risk. That may mean continuous deployment, but it may not. Particularly in the case of products or embedded systems it doesn’t make sense to do continuous deployment in the sense of continuous release to users, but it still makes sense to do continuous delivery: you still want to go as far down the pipeline as you can, you still want to be continually taking builds and running them in production-like environments, which in the case of embedded systems means on the real electronics, as frequently as you can. So Continuous Delivery applies to any kind of system that involves software.
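
The distinction drawn above can be sketched in a few lines of Python. This is an illustrative toy, not something from the interview: the `Pipeline` class, its `auto_release` flag and the build identifiers are all hypothetical names. In both modes every passing build becomes a release candidate; the only difference is whether the release button is pressed automatically.

```python
from dataclasses import dataclass, field


@dataclass
class Pipeline:
    """Toy model: every build that passes its checks is a release candidate."""
    auto_release: bool  # True models continuous deployment (hypothetical flag)
    candidates: list = field(default_factory=list)
    released: list = field(default_factory=list)

    def on_build(self, build_id: str, passed: bool) -> None:
        if not passed:
            return  # a failing build never becomes a candidate
        self.candidates.append(build_id)
        if self.auto_release:
            self.release(build_id)

    def release(self, build_id: str) -> None:
        """Push-button release of a build that already passed the pipeline."""
        if build_id not in self.candidates:
            raise ValueError(f"{build_id} has not passed the pipeline")
        if build_id not in self.released:
            self.released.append(build_id)


def demo():
    deployment = Pipeline(auto_release=True)   # continuous deployment
    delivery = Pipeline(auto_release=False)    # continuous delivery, release on demand
    for pipeline in (deployment, delivery):
        pipeline.on_build("b1", passed=True)
        pipeline.on_build("b2", passed=False)
        pipeline.on_build("b3", passed=True)
    delivery.release("b3")  # the business decides when to press the button
    return deployment.released, delivery.released
```

In this sketch the delivery pipeline still validates every build; it simply leaves the decision to release with a person.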


6. So that was their concern, was "am I required to deploy all the time?" and they are working in very mission critical systems that they frankly were very scared of the very thought of continuous deployment. So is risk one of the criteria for considering not deploying all the time?

Actually continuous delivery is all about risk management and increasing the stability of your systems. That really is one of the key value propositions, apart from the feedback thing. One of the things that I like to point out is that when Flickr was acquired by Yahoo, Flickr were deploying 10 times a day more or less and Yahoo obviously had a more traditional process. They did some stats and they worked out that Flickr actually had a higher uptime than Yahoo, because this stuff requires discipline. What you are doing when you implement a deployment pipeline is constantly validating your system against realistic scenarios, and that gives you much better transparency into the risk of the changes you are making, and that is really important.

This stuff is really about reducing risk, it’s about increasing transparency, it’s about constantly validating what you are doing. In mission critical systems that is really important: continuous delivery allows you to be constantly validating against what is actually going on in real life and getting fast feedback on this stuff. So I think if you are doing mission critical systems this stuff becomes more important.


7. Right. So you are working the problems out as you go and your releases have fewer problems coming through the pipeline.

Exactly, because all the practices in continuous delivery are about making sure that when you release and there is a problem, root cause analysis is really simple. So we have practices like building the binary once at the beginning and then taking that same binary all the way through to production. That removes the binary as a source of problems in your release process, because you know it’s the same binary you tested all these times all the way through the pipeline. With infrastructure as code, managing the configuration of your environments from source and testing it all the time allows you to remove your infrastructure configuration as a source of risk in the delivery process.

And the same is true with database deployments, configuration management, all this stuff: writing scripts, and using the same scripts to deploy to production that you use to deploy to testing environments. It’s all about testing every single part of the release process from as early on in the project as possible and as frequently as possible, so you remove those sources of risk early on.
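
The two practices described above, building the binary once and deploying to every environment with the same script, can be illustrated with a small Python sketch. It is an assumption-laden toy rather than how any particular tool does it: `fingerprint`, `deploy` and `promote` are hypothetical helpers, and the "artifact" is just a byte string. The idea is that a checksum recorded once at build time lets every later stage confirm it is handling the build that was tested, and that only the configuration differs between environments.

```python
import hashlib


def fingerprint(artifact: bytes) -> str:
    """SHA-256 of the artifact, recorded once when the binary is built."""
    return hashlib.sha256(artifact).hexdigest()


def deploy(artifact: bytes, expected_sha256: str, environment: str, config: dict) -> dict:
    """One deploy routine for every environment; only `config` differs."""
    if fingerprint(artifact) != expected_sha256:
        raise ValueError(f"artifact for {environment} is not the build that was tested")
    # A real script would copy files, migrate databases and so on (omitted here).
    return {"environment": environment, "sha256": expected_sha256, "config": dict(config)}


def promote(artifact: bytes, environments: list) -> list:
    """Take the same artifact through each (name, config) pair in order."""
    expected = fingerprint(artifact)  # computed once, never rebuilt per environment
    return [deploy(artifact, expected, name, cfg) for name, cfg in environments]
```

Because the production deployment exercises the same `deploy` function as the test deployments, the script itself has been rehearsed many times before it matters.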


8. Ok, so that is test driven development really, right?

Absolutely. It’s basically applying the original principles of XP to the whole delivery process from beginning to end. I think in terms of principles there is not a lot that’s new in the book, it’s not a massive original leap. It’s all stuff that we’ve been talking about from Agile and even before that. What I think is new is a bunch of practices and patterns, like the deployment pipeline, that we’ve proven to be successful over the last 5 to 10 years at Thoughtworks, and also a bunch of tools that have come out in the space, like Puppet and Chef, and all these tools that help you manage the whole stack. The F5 Networks people just released a VM that allows you to simulate your router load balancing configuration in a test environment, and now what we are finding is you can actually validate the whole stack way before you actually have to release to production.


9. Test Driven Development done wrong: talking about maybe unit testing, what is a bad example of unit testing?

Unit testing I think is really important. One of the things I think is essential in order to be able to do these validations is that you have automated tests at every level; whether that is the unit test level or the automated acceptance test level, you need to have these validations automated. One of the constraints on the ability to do continuous delivery is if you are doing manual regression testing. My colleague Neal Ford has a joke that "when humans do things that computers could be doing instead, all the computers get together late at night and laugh at us", and I think in terms of regression testing that is absolutely true. So yes, you need tests at the unit level and the acceptance level in particular that are automated.

In my experience the only way to create an effective set of automated unit tests is through TDD, and so that is a really key principle. And it’s one that is not sufficiently employed. I facilitated a panel on continuous deployment in Silicon Valley a couple of months ago and I asked how many people were doing TDD. It was something like 25% of the audience and I was pretty shocked, because I thought that in Silicon Valley people are going to get it, but it’s not really the case. It’s the same in enterprise software; it’s one of these memes that hasn’t really caught on as much as it should have done.

So on TDD done wrong, the first thing is you should very strongly consider doing it, because not doing it is a big problem. But there are really important practices around how to do it right and how to do it wrong, particularly around creating maintainable suites of tests. It is very easy to do TDD in a bad way where you create suites that aren’t maintainable, that break all the time not because the system under test is buggy, but because you’ve written the test in a brittle way. One of my favorite books that came out in the last couple of years was Growing Object-Oriented Software, Guided by Tests by Nat Pryce and Steve Freeman, who actually won the Gordon Pask award a couple of years ago here at the Agile conference.
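
To make the point about brittle tests concrete, here is a minimal Python sketch; the `Cart` class and both test functions are invented for illustration and are not from the interview or the book mentioned. The first test checks observable behaviour through the public interface; the second reaches into private storage, so it would break on a harmless refactoring even though the system under test still works.

```python
class Cart:
    """Tiny system under test (hypothetical example)."""

    def __init__(self):
        self._lines = {}  # internal detail: sku -> (unit_price, quantity)

    def add(self, sku, unit_price, quantity=1):
        _, current = self._lines.get(sku, (unit_price, 0))
        self._lines[sku] = (unit_price, current + quantity)

    def total(self):
        return sum(price * qty for price, qty in self._lines.values())


def test_total_reflects_added_items():
    # Maintainable: asserts only on behaviour a caller can observe.
    cart = Cart()
    cart.add("book", 30, quantity=2)
    cart.add("pen", 5)
    assert cart.total() == 65


def brittle_test_peeks_at_storage():
    # Brittle: passes today, but fails as soon as `_lines` becomes a list
    # or a different structure, even if `total()` is still correct.
    cart = Cart()
    cart.add("pen", 5)
    assert cart._lines == {"pen": (5, 1)}
```

Both tests pass against the code as written; the difference only shows up when the implementation changes.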


10. Recently on one of your blogs you talked about feature branching. Talk to us about feature branching.

Back in the old days, before distributed version control systems, one of the things we found in many large organizations is that they would use branches for developing on and they would integrate into mainline very infrequently, at the end of the release. When that happened you would integrate all these branches and you’d find that A) it would take ages to merge, and that would be extremely painful, and B) when you finished merging, the system wouldn’t even work. That was the source of a lot of the pain of integration that the original continuous integration stuff was designed to solve.

So one of the things that happened is we’ve been beating this drum for many years - "Don’t use these source control tricks." Branching is fine, there is no problem with branching it’s just this practice of feature branching where developers don’t merge into trunk regularly which is problematic and we still see that today frankly, a lot, much more than we should. And so one of the interesting things that happened in the last few years is the rise of distributed version control systems and it’s something people at Thoughtworks have been using since the early days. I started writing that book back in 2006 and then I’ve used Git and Mercurial almost exclusively for the last 2-3 years.

So I am a big fan of DVCSs, and one of the points that a lot of fans of DVCS use to talk about the benefits of the tools is feature branching and the ease of merging. I am conflicted on this because in a DVCS, from a purely semantic point of view, every time you are working on a developer workstation it’s a branch, by definition. So yes, you are always working on a branch. I like to say feature branching is evil, but that is a sound bite. The real point is you don’t want to keep too much inventory away from mainline. You want to make sure everyone on your team is constantly merging into mainline, which in the case of DVCS is a conventionally designated central repository which is the start of your build pipeline.

That is where the binaries are created, and they get taken into production. So really the point I am trying to make is yes, you are working on feature branches, that is OK; the point is you want to be merging regularly into mainline, and obviously when you do that you have to pick up other people’s changes as well and merge those in. But you want to make sure there is not too much inventory on those branches, not more than you can read and make sense of pretty easily. And there are a number of reasons why it’s problematic, not just because of the integration problem, but also because it discourages refactoring.

If a bunch of people have stuff on branches and someone refactors, yes, they should tell you when they are going to do it, and that is important. But if they tell you and you’ve got weeks’ worth of stuff that is not merged, then telling you is great but it’s not going to solve the problem that you now have to merge in a week’s worth of stuff.


11. And manual testing, when does that come into play and that is where human error comes in that is where we have the best chance to introduce errors, so automated testing versus manual testing?

Absolutely. Brian Marick has this great diagram, his test quadrant diagram, where he divides tests into four quadrants according to whether they are developer facing or customer facing and according to whether they validate the technical part of the system or the user facing part of the system. I am not sure if I’ve got it right, but it’s something along those lines. But the point is that you’ve got in one part of the quadrant the unit tests and the component tests, and in the user facing part you’ve got the acceptance tests, and then down on the bottom right there are cross functional tests (security, performance, availability and so forth), and in the top right there are things like showcases, exploratory testing and usability testing.

That stuff, showcases, usability testing and exploratory testing, is what humans are good at, and that is what your testers should be spending most of their time on, because that is where it needs imagination and cleverness and smarts. If you are using humans to do this other stuff on the left hand side, that is really problematic because it’s error prone. The days where you could have these massive acceptance test scripts that people repetitively go through, those days are gone, I think, in the case of strategic software. In reality people still do it, but I think as we start reducing the lead time and cycle time of our projects it’s going to be too big of a constraint. So yes, all this stuff on the left side should be automated.

We are starting to see more tools for doing things like performance testing and security testing in an automated way; it’s still hard, but these practices are coming forward. Creating maintainable suites of automated acceptance tests is still hard, but again Dave and I talk about some of the practices around that in the book "Continuous Delivery", and we are starting to blog more about this stuff. It’s something we know is possible because we’ve done it successfully on projects at Thoughtworks, but the practices and the tools are still evolving.


12. So we are coming down to the commit stage, what happens when the tests fail?

Firstly you should know. One of the things we talk about is the importance of getting feedback in 5 minutes or less. It’s not going to be comprehensive feedback, but it’s going to be some indication: is my system still working? That is obviously the first thing people should find out, and then the next thing is that you actually have to stop and fix it. There is this concept from lean of the cord you pull when you see there is a problem, and everything stops. So at that point someone needs to pony up and say: "I am actually going to volunteer to fix this", and that is the main thing. People talk about continuous integration and often think it’s about the tool. It’s not, it’s about the practice, and the key things are A) you have to get the feedback and B) crucially, people have to act on it. I am sure we’ve all been to places where there is a CI server and it’s red and no one is paying any attention to it; at that point you are not doing continuous integration.
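
The commit-stage behaviour described above (fast feedback, then stop and fix) can be sketched as follows. This is a hypothetical illustration in Python, not the behaviour of any specific CI server: `run_commit_stage`, the result dictionary and the injectable `clock` parameter are assumptions made so the example is self-contained and testable. The 300-second budget mirrors the "5 minutes or less" figure from the answer.

```python
import time

FEEDBACK_BUDGET_SECONDS = 5 * 60  # the "5 minutes or less" target


def run_commit_stage(checks, budget_seconds=FEEDBACK_BUDGET_SECONDS, clock=time.monotonic):
    """Run fast checks in order and stop the line at the first problem.

    `checks` is a list of (name, callable) pairs; each callable returns
    True on success. Later checks are not run once one has failed.
    """
    started = clock()
    for name, check in checks:
        if not check():
            return {"status": "red", "failed": name, "accept_commits": False}
        if clock() - started > budget_seconds:
            # Slow feedback is treated as a failure in its own right.
            return {"status": "red", "failed": "feedback budget exceeded",
                    "accept_commits": False}
    return {"status": "green", "failed": None, "accept_commits": True}
```

The tool only reports red or green; as the answer stresses, the practice depends on someone acting on a red result.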


13. That is really the importance of tools, I guess.

You have to have the tools. I mean the tools are useful but the important thing is the human factor.


14. Just taking a step further, we’re down to the release. What happens when the release fails and there is a couple of reasons why we’ve gone this far?

At the point the release fails, the first thing that happens is to restore service. You have to focus on restoring service. Certainly for anything critical, that is the first priority. But then the important follow-on to restoring service is doing root cause analysis and actually working out why that happened, and being able to put guards in place to prevent it happening again, which certainly means tests at some point, so that problem can never occur again. Which again speaks to the importance of automating everything, because those tests may be on your code, and those tests may also be about your infrastructure configuration. Being able to test your infrastructure configuration is one of the key things that comes out of the infrastructure as code movement, being able to do BDD on infrastructure using tools like Cucumber and Puppet. But yes, I think first restoring service, then doing the root cause analysis.
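
As one illustration of a guard on infrastructure configuration, the Python sketch below compares a candidate environment against production and reports any unexpected differences. It does not use Cucumber or Puppet; `config_drift`, the `MISSING` marker and the example keys are hypothetical, and a real setup would read these values from whatever manages the environments.

```python
MISSING = "<missing>"  # marker for a key that one environment lacks


def config_drift(production: dict, candidate: dict, allowed_to_differ=frozenset()) -> dict:
    """Return {key: (production_value, candidate_value)} for unexpected differences.

    Keys listed in `allowed_to_differ` (hostnames, credentials and the like)
    are skipped, since they are expected to vary between environments.
    """
    drift = {}
    for key in sorted(set(production) | set(candidate)):
        if key in allowed_to_differ:
            continue
        prod_value = production.get(key, MISSING)
        cand_value = candidate.get(key, MISSING)
        if prod_value != cand_value:
            drift[key] = (prod_value, cand_value)
    return drift
```

A check like this could run in the pipeline after a root cause analysis has traced a failure to an environment difference, so the same difference is caught automatically next time.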

John Allspaw did a really great talk at Velocity this year which is very well worth checking out. He talks a lot about creating reliable systems and doing things like root cause analysis, and some of the practices around making sure you can restore service fast, create resilient systems and so forth.


15. You mentioned Puppet; what do you think of the whole DevOps movement?

I am very excited about it. I think it’s very interesting. Just before I came here I was watching a talk by Patrick Debois, who really founded the movement, and Julian Simpson about some of the latest tool advances in this space, and they were talking about Vagrant for managing virtual machines, and Chef, Puppet and so forth. DevOps has two components really. In my opinion DevOps has resisted definition on purpose. It’s kind of an anti-movement in some respects, because I think they want to focus on the cultural side of things first, and this idea of development, operations and testing collaborating very closely all the way through the delivery cycle. A lot of the problems in releasing software reliably come from the fact that developers are measured on how fast they can deliver stuff, while operations are measured on the stability of the production systems. And so they come into conflict.

I think one of the primary messages of continuous delivery is that it’s not a zero-sum game. This is why I like to talk about the Flickr example: because they are releasing more frequently, the stability of their production systems is also increased. So you can achieve both of these things, and DevOps talks a lot about how you do that, both through collaboration and through the application of Agile techniques like infrastructure as code, test driven development, refactoring and so forth to infrastructure. Those are really the two components.

There was a blog entry about DevOps which talks about culture, automation, measurement and sharing, and that is kind of a good way to think about it simply. Any time that you are doing continuous delivery in an organization with an operations department you need to be thinking about DevOps. It’s crucial to enable continuous delivery.


16. I just wanted to find out what you are up to, what you are currently working on or perhaps what you are interested in right now, maybe aside from continuous delivery?

One of the things that’s piqued my interest recently is the lean startup stuff that Eric Ries has been working on, partly as a factor of moving to San Francisco and actually seeing a lot of this stuff happening around me. So I did a talk on Tuesday at Agile 2011 about taking the lean startup to the enterprise. Obviously most of our customers at Thoughtworks are enterprises. And it fits nicely with continuous delivery, because one of the key things about the lean startup is how you innovate and produce novel products and services under conditions of extreme uncertainty.

And that is the problem that enterprises face as well, particularly now we have a boom in Silicon Valley; these people are going to eat enterprises for lunch if enterprises aren’t ready to respond rapidly to changing market conditions. So I think Eric Ries has come up with this whole methodology, and it’s stuff that’s been going on in Silicon Valley and other places for a long time but it hasn’t necessarily been codified.

Steve Blank wrote a book, "The Four Steps to the Epiphany", about customer development, which is a key element of this stuff, and you can kind of think of it as a cycle. There is a customer development side where you have ideas and you work out what you should build, and then there is a continuous delivery part where you build stuff and you get feedback from users on what you’ve built and whether it’s valuable. That goes back into validating your ideas and iterating on your ideas, and maybe finding out your whole business hypothesis was flawed and then pivoting your business ideas. So I think it’s fundamentally the application of the scientific method to the process of innovation, and that is something that a lot of enterprises could benefit from.

It touches a lot of different parts of enterprises, from the PMO to the delivery process to the work of operations, and it fits into continuous delivery and DevOps, so that is one of the things I am particularly interested in right now.


17. Very interesting. This is a very non-technical question. The venture capital community, are they becoming aware of the lean startup, because I can envision a day when we are populating again with tons and tons of startups?

Absolutely, and that is happening right now. The VCs are very much interested in all this. Steve Blank has been working with the VCs for a number of years now, and I think VCs are interested in it because it reduces the risk of their investments. If you can have a more scientific approach to the management of startups and you can get a higher success rate, or faster failure at least, that is very valuable. And one of the points I’d like to make is that the business within enterprises is effectively acting as venture capitalists. They may not think they are, but they actually are. One of the problems is that projects get measured in terms of their success based on delivering on scope, on time and on budget.

That is not actually a good measure of the success of the project. A good measure of the success of the project is, did we actually make money? That often isn’t even taken into account so I think, we’ve certainly worked on projects within Thoughtworks where we’ve done a great job, the customer has been very happy, we’ve delivered the project, we’ve delivered the service or whatever and then we come back a year or two later and it’s died because it turned out that people didn’t actually want it. And so until people actually start measuring this stuff--not just was it delivered in an acceptable way and meeting the constraints but was it actually valuable to people, this goes back to the first principle of the Agile manifesto.

Our first job is to deliver high quality, valuable functionality to our users. I can’t remember exactly the wording of it, but this is what enterprises need to be focusing on and people in the businesses who pay for these projects they are VCs, they don’t think about themselves like this but they are.


18. You brought Thoughtworks up and there is the old expression "eating your own dog food." I think it’s pretty well understood that Thoughtworks eats their own dog food, but can you shed a little light on how Thoughtworks integrates this in their own practices for studios?

Absolutely. I worked for the last three years within Studios. I was the product manager for our continuous integration and release management tool called "Go", and we were very heavily into that, to the extent that we built Go using Go, and when we had a good build of Go it would redeploy itself in order to rebuild itself again, so it’s kind of meta. So we built that into our process. We have other tools, Mingle for project management and Twist for test automation, and we have a big internal grid of boxes that we use for building and testing those using Go. On the delivery side of the products we very much dog food: we use Twist to write the automated functional tests for Go, we use Mingle to manage the projects and so forth.

Also we like to use those within Thoughtworks, but Thoughtworks is very proud of its objectivity and of doing the right thing for customers. So Thoughtworks consultants will never shill our products, actually to the extent of overcompensating and saying we are not even going to push these tools. It’s kind of interesting that the harshest criticism we get for our products is from within Thoughtworks. People outside Thoughtworks are much politer about our stuff than other Thoughtworkers, and you know that if Thoughtworkers like what you’ve done then you’ve done something really good.

As long as you’ve got a thick skin and you can survive that criticism it’s really useful, so yes, absolutely. We now have a continuous delivery practice within Thoughtworks. Continuous delivery is something that appeals to executives. You can take this message to executives and they love it and they are interested in it, and we’re actually doing customer development on developing this practice and delivering offerings to users. So we want to take this stuff and use it to build our business. I think this has always been something we’ve been doing within Thoughtworks: being early adopters, testing this stuff out, finding what is valuable, but always subordinating it to doing the right thing for our customers.

We have a technology radar, one for this quarter that just came out, where we talk about what is new in the technology landscape, to what extent it’s tested and to what extent we would recommend it: the practices are solid around this, this is well understood technology, you can use it; or this is stuff that is new, trial it, don’t necessarily put it into practice. So we always want to make sure that we are doing the right thing.


19. It’s a learning experience for you to keep your head in the game basically, have there been any major failures in that process, in the products or maybe the continuous deployment and integration actually has been very successful?

Continuous integration is a kind of a no-brainer, it’s one of those practices that almost universally makes a big positive difference in terms of bang for your buck. You always need to be careful about all these practices. You can never say "this is always right for you," you always have to understand the context and the human element in particular. There is a well-worn saying that most failures are people failures not technology failures, all problems are people problems and that is absolutely true. Any time you recommend something you have to specify the context in which it applies.


20. And need to adapt to that context.

Absolutely, which is the root of Agile; again it’s the scientific method. One of the things that we say about continuous delivery, for example, is that you shouldn’t drop everything and implement continuous delivery. You should always be incremental about these things: take a pilot project, something which is strategically important, but where people aren’t working weekends and nights to deliver. Try out these techniques, see if they work for you, apply the scientific method: you have a hypothesis, you test it, you get the results back. It’s a cycle, and you need to do these things in an incremental way. So that is very important. I think there is no sense in which you can say of any of these things: "You’re going to do this, it’s going to be a silver bullet, it’s going to solve all your problems". That is crap.

And there is the tool vendor version of this, which is "if you buy our tool everything will be fine." We resist that even as a tool vendor. When someone comes to me and says: "We want to do continuous delivery, can we have your tool?", you always want to step back and say: "Listen, the most important thing is the organizational part of it. Let’s focus on that. Yes, the tool will enable it, but you need to focus on the organizational element."


21. Get the philosophy part right first and put it into practice. Just as an observation I noticed that the tool vendors have maybe a part of the solution and they are talking more to each other because each one enables the other especially when you are talking about ALM. Is that your observation as well with Thoughtworks or is Thoughtworks trying to solve the entire problem front to back?

No we are not, we are not big enough to do that. Yes, I think there’s a lot of collaboration between vendors, and there is this kind of marketing term "co-opetition" which I think applies even though it’s a horrible mangling of the English language, but you always need to be aware of what is happening. All tools exist within an ecosystem and you’re only ever going to solve part of that ecosystem problem. So for example with our stuff within Studios, if you look at Go, it is kind of an orchestration layer around the deployment pipeline, but it relies on build tools, deployment tools, testing tools, infrastructure management tools, project management tools. It has to tie into all these things. We are never going to solve this whole problem, we don’t want to, it’s not the right thing to do. We need to tie into that stuff, and also all these tools exist within the context of a process.

Studios’ main value promise is that you have a process and we are not going to mandate the process. We are going to adapt to the process you use. If you look at some of the full stack tools, where they do try and solve the problem front to end, they sell you on these beautiful graphs and these reports that you get. They don’t work unless everyone uses the process that they mandate, and this is problematic because it might not be the right process for you, and it relies on people filling in all these boxes. As a developer there is nothing people hate more than being told that before you check in any code you must fill in these 20 boxes.

People get around it by filling it full of crap, and then you get the graphs but they are meaningless. So our whole thing is we are going to adapt to your process. You can still get the pretty graphs, but you get them by configuring the tools according to your process, and we’ll still gather that information for you. I think we’re kind of unique in that space in doing that, but we think it’s the right way to do it.


22. We are pretty much at the end of our time, but as maybe a couple of parting words do you have any advice for maybe the enterprise architect community that’s taking a look at continuous delivery now?

Yes, I think in terms of enterprise architecture there are two elements: there is the element of what it actually means to be an architect, and what the value of architecture is as an engineering practice. I think we have always advocated architects who are practitioners, who know how to code and actually code, and again it’s about the feedback loop. You put an architecture in place, and that architecture will change as the product evolves and as the service evolves, and as business plans change that will have an impact on the architecture as well. So it’s important that architects are always involved and actually writing the code to implement it, so you get feedback.

In an enterprise context, one of the things we did recently which was quite interesting was we got all the architects from all the regions together and we had a session where everyone just talked about stuff. Really the benefit of that was getting a shared understanding among all the architects of what they were doing and the implications of it, and then meeting regularly after that to touch base and drive that feedback loop. So in terms of the human elements, I think architects being practitioners, in terms of actually coding, is important. In terms of architecture there is this misconception that Agile removes the need for architecture, and that is really not true.

Again, what we need to talk about is just enough architecture up front. You always need architecture because, apart from anything else, there is the standardization side of it, and there is also the fact that architecture is what defines the cross functional attributes of your system: performance, availability and so forth. So it’s important to get that right, but you won’t get it right the first time; that is just the nature of complex systems. So again the deployment pipeline is useful because it allows you to validate your architecture from early on, and if you can validate your architecture by running performance tests, availability tests, these kinds of things from early on, then you can actually make the changes to your architecture that you need to make sure that it’s actually the right one.

And again it’s just the feedback loop, constantly validating and refining your architecture, because those changes are expensive to make late. You want to make them early on when they are cheap to make, which is part of the value proposition.

Oct 14, 2011