Martin Fowler and Paulo Caroli on Continuous Integration and Deployment

1. My name is Paulo Silveira and I have worked at Caelum for 6 years. I’m here with Paulo and Martin. They will introduce themselves.

Paulo Caroli: My name is Paulo Caroli. I have worked for ThoughtWorks for the last 4 years, and I’ve been at ThoughtWorks US, ThoughtWorks India and now ThoughtWorks Brazil.

Martin Fowler: I’m Martin Fowler, I also work for ThoughtWorks. I’m a general loudmouth: writer, speaker and blowhard.

2. A lot is being discussed about acceptance tests, and they’re really useful for improving the quality of our software, but nowadays a huge problem is that most acceptance tests take a lot of time to run. When you have continuous integration and you set up your build strategy to run all of these acceptance tests with a tool like Selenium, it needs to open a web browser and run all the tests, and that takes a lot of time if you have hundreds of them. The feedback will be a lot slower than what would be acceptable. How can we deal with these kinds of builds?

Martin Fowler: The key answer to this, which has been true since the very first Extreme Programming project, is to use a build pipeline. That is, to have two layers of tests, because that way you can have a fast-moving set of tests that run quickly, get under that 10-minute mark and give you rapid feedback. Usually, in order to do that, they’ll have to be fine-grained, unit-style tests with any slow connections, like databases and that kind of thing, stubbed out.

You are typically not going to run those through the browser, with the possible exception of JavaScript-oriented stuff, but most of the stuff won’t run through the browser. Then you have the acceptance tests that do run through the browser and do have things like databases connected, running on a separate stage of the pipeline. And yes, they are going to run slower. They might run in 30 minutes, they might run in a couple of hours, but the point is your basic commit cycle is based on that first set of tests.
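The two-stage idea described above can be sketched in a few lines of Python. This is only an illustrative model, not how any particular CI server works: the `Stage` class, the `run_pipeline` function and the two stand-in checks are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str                         # hypothetical stage name
    checks: List[Callable[[], bool]]  # each check returns True on success


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first stage with a failing check."""
    passed = []
    for stage in stages:
        if not all(check() for check in stage.checks):
            break  # later, slower stages never start
        passed.append(stage.name)
    return passed


# Stand-ins for real suites: fast stubbed unit tests, then slow browser tests.
commit_stage = Stage("commit", [lambda: 1 + 1 == 2])
acceptance_stage = Stage("acceptance", [lambda: "ok".upper() == "OK"])

print(run_pipeline([commit_stage, acceptance_stage]))
```

The point of the ordering is that a failure in the fast commit stage is reported without waiting for the slow acceptance stage to run at all.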

Paulo Caroli: I would add another aspect, one thing is the pipeline, so you have different sets of tests that run at different times in the pipeline. Another thing you can do is you can run the tests in parallel. This was more difficult earlier on. To run the tests in parallel, there are two important things. One is the design of the test; make sure there is no dependency of one test on the other, so there is no sequential need for the test, they can run in parallel.

The second is the power to run in parallel. The power can be machine power, having enough machines so you can parallelize stuff. Another one, for example, is the toolset we use these days. You mentioned Selenium; we have been using Selenium Grid for a long time. As long as the tests can run in parallel, you start Selenium Grid and they run fine. The tools that you use for continuous integration have evolved a lot.

For example, Go has the capability where you can organize the build pipeline but you can also select what you want to run in parallel - that’s another way: even though there are a lot of acceptance tests in one of those stages, you can run them in parallel.
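The design point about independent tests can be illustrated with a small Python sketch. It uses the standard library's thread pool rather than Selenium Grid or Go, and `run_in_parallel`, `test_login` and `test_search` are hypothetical names made up for the example. Each test builds its own data and shares nothing with the other, which is what makes running them concurrently safe.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_in_parallel(tests: Dict[str, Callable[[], bool]], workers: int = 4) -> Dict[str, bool]:
    """Run independent tests concurrently and collect pass/fail per test name."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {name: pool.submit(test) for name, test in tests.items()}
        return {name: future.result() for name, future in futures.items()}


def test_login() -> bool:
    session = {"user": "alice"}  # local state only, nothing shared
    return session["user"] == "alice"


def test_search() -> bool:
    catalog = ["book", "pen"]    # local state only, nothing shared
    return "pen" in catalog


results = run_in_parallel({"login": test_login, "search": test_search})
print(results)
```

If one test relied on data left behind by another, the result would depend on scheduling order, which is exactly the sequential dependency the answer warns against.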

3. What would be a short enough build time to get the feedback?

Martin Fowler: For the commit stage, you want 10 minutes or less, and the ThoughtWorks teams that I’ve talked to over the last couple of years seem to find 10 minutes a bit long. I think that’s the point at which you’d be tipping over. The later stages can cycle much more slowly, depending on your needs. Following up on the point Paulo’s making about the parallel stuff, a very good example of where that has to be done a great deal is Mingle, which is a collaboration tool, because it’s very heavy on JavaScript usage.

It’s got all of these virtual card walls that you’re moving around, lots of JavaScript in there, and it has to be run on every combination of operating system and browser that people are likely to run into. Those tests are done with a lot of parallelism. They still take a long time, but by cycling them after the initial commit tests, you still get most of your feedback early on.

4. Polyglot programming. There is this kind of movement and this hype about learning several languages, that you should not be focused only on one language or on one technique. How fundamental do you think this is?

Martin Fowler: There are two points here. One is what people mean when they talk about polyglot programming. It’s usually the notion that you are using multiple languages on a project. Often that’s a primary general purpose language and multiple DSLs, which we do all the time anyway, but it can sometimes mean multiple general purpose languages to handle different parts of the problem.

Going back a little bit to the last question about functional tests, on one of the projects that we’ve got going on here in Brazil, for a large web site, the core work on the system is being done in Java, but all the acceptance tests are scripted using Ruby with a Ruby-driven Selenium.

That’s because the functional testers found it a more efficient way of working. That idea of using different languages for different strengths, that’s what polyglot programming is about. There is a separate point, which is: should programmers learn multiple languages in order to become better, more capable programmers?

I’m very much in favor of this. I think of the Pragmatic Programmers’ advice: try to learn a new language every year, try to make it a different language (if you already know Java, learning C# doesn’t count) and not necessarily a language that you expect to use day-to-day in your production programming work, but one that will stretch your mind in a different direction and open new possibilities.

I found when I was a consultant in the early days of OO, when the two primary languages were Smalltalk and C++, that I had a huge advantage knowing both of those languages, because I’d bring ideas from the Smalltalk world into C++ and occasionally I’d bring a useful idea from C++ into the Smalltalk world. Knowing both gave me a really good picture, which was even more valuable when Java became popular.

5. The next question is about your upcoming book. Your book about Enterprise patterns was a huge success, I think especially because all kinds of systems needed this kind of information. Do you think that the DSL book will have the same impact? For example, will it be used on almost every kind of system?

Martin Fowler: I don’t know. The purpose of writing the book is to explain the techniques behind domain specific languages to people, so that they are more approachable, that it’s easier for people to do them. Because essentially I feel that at the moment a lot of people hold back because they don’t know these techniques. An interesting question is once these techniques become more and more known and better understood, will that lead to more usage of DSLs? I don’t know.

I think it’s certainly something worth trying, particularly when we talk about DSLs that are readable by business people. Because that way it opens up the communication channel and I think that can be very valuable. But with anything like this, with a book like this, it’s very speculative. You’re sitting there and you don’t know what kind of impact it might have.

It’s kind of like when I wrote the book on refactoring - I didn’t know what kind of impact it would have. It would be great if it’s very influential, but all you can do is try and see, and if not, then that’s the way things go.

Paulo Caroli: I do think this book is going to change a lot how we do things. The main reason I say that is because I remember the days when the Gang of Four book on Design Patterns came out. Design patterns were being used, everybody was doing something similar, but there was no name for it. It was not clear to us, and then it was "Hey, guys! Let’s pay attention to this."

The first time I came across DSLs and started reading Fowler’s ideas, I said "Oh, that’s what you call an internal DSL and that’s what we call an external DSL. Wait a second! Those things that we were using, those were DSLs." I think it’s going to make clear for us developers in the community what we use all the time and what we should be paying attention to, what’s there that we didn’t know how to use.

Martin Fowler: We have a long history in the software business of coming up with the wrong, overcomplicated distributed systems approaches that sounded like a good idea in theory and never worked terribly well in practice. The web is the great example of something that had many theoretically very bad ideas and worked extremely well in practice. The interesting thing is whether we’ll see that same thing flow with REST.

We work a great deal with folks like Jim Webber or Neal Robinson who are very strong advocates of the REST approach and they've got a long experience in dealing with this whole interoperability stuff. I certainly find their arguments quite convincing. We too easily fall into the trap of complicated and in the end very heavily coupled ways of doing distributed stuff.

REST offers an interesting approach, but it’s quite different and I think has a lot to go for it. We'll see, but we never know until we actually thrash these things out. Certainly, me and the people I work with are much more inclined to bet on REST. I can see that work out next.

Paulo Caroli: As a developer, I would say "Remove all the vendors and think about when you have to handle some existing code: what do you think is going to be easier for you?" I think the same hype happened with CORBA, or with EJBs later on. Great ideas sound really cool, the tools are really amazing, but when you put that approach in and have to handle some existing code, you go like "I wish none of these were here. I wish things were simpler". REST is showing us there are simpler ways to do things.

Martin Fowler: But there are still some definite challenges in trying to understand what makes a good RESTful idea, and people are still learning how to describe some of the kinds of things that make that up. I am much more encouraged by that kind of direction than I am by the whole WS-* approach.

7. About the Agile Manifesto, there are some principles, and one that is much discussed in the Brazilian communities says that we need to deliver the product early and continuously. How early and how continuously should we deliver?

Martin Fowler: As early and as continuously as possible. You want to go into production as rapidly as you possibly can, and then you want to be cycling updates as quickly as you can as well. There is still a tension here with how rapidly the customer wants to get something, and with what the minimal thing is that they can go with first. There is definitely a tension and trade-off there, but you really do want to be getting through that cycle.

Even if you can’t actually put things into live production, you want to be able to produce production-quality stuff that could go into production. I think that’s a really important test, because traditionally where the waterfall stuff has fallen down is that it leaves complicated, difficult-to-predict processes like testing and integration until late in a project, where they cause the most hassle.

8. It’s the last mile.

Martin Fowler: Exactly.

Paulo Caroli: One important thing to say about the word "early": it’s not "sloppy". It’s not about "We’ve got to put this in production, whatever it is". It’s early but of good quality, it’s not sloppy work.

Martin Fowler: As early as possible, but always top quality.

Paulo Caroli: In fact, if you follow what Martin said about continuous integration, it can take you to the next level, which is continuous delivery. That helps you to automate and build this process where things move step by step from one environment to the other. You automate as much as you can, you verify that it is of good quality, and it goes to production smoothly, if possible every day, if not every hour, depending on the system.

Martin Fowler: A big inspiration for me has been Kent Beck, the guy who coined the term continuous integration. After we worked together at the Chrysler eXtreme Programming project he went on and did some further work in Switzerland, with an insurance company. This was in the ‘90s, but they put production code into production every night.

He used a much more sophisticated programming environment, called Smalltalk, than we have these days, but the point is he got into that regular continuous cycle that worked very effectively for them. I think more and more places can do that, it’s not just the cool web kids that can do this. More organizations are capable of doing that if they can learn how to take up the kind of techniques that we like to talk about here.

9. There is one problem that you talked about - the database evolution. If you have a database evolution and there is some kind of mess, you need to go back and you can lose a little bit of that. How could we handle this? Would we need to work like this?

Martin Fowler: There are ways you can deal with that. One is to time your database evolution so that you make the space for new changes in one cycle and then actually use the space in a different cycle; that way each of those things is easier to back out. Another is a technique called event sourcing, in which you capture all the input events in such a way that they can be replayed against different versions of the database.

There are techniques you can use to get around this kind of problem and certainly it’s not impossible, it’s stuff we’ve done. The Red Dream was actually taken from one of the projects that we did in London several years ago.
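A minimal Python sketch of the event sourcing idea mentioned above: the same captured input events are replayed to build state under two different schema versions. The event shape, the account names and the `replay_v1` / `replay_v2` functions are assumptions made up for illustration, not taken from the project being discussed.

```python
from typing import Dict, List, Tuple

# Hypothetical event shape: (kind, account, amount)
Event = Tuple[str, str, int]


def replay_v1(events: List[Event]) -> Dict[str, int]:
    """Version 1 schema: a single balance per account."""
    balances: Dict[str, int] = {}
    for kind, account, amount in events:
        delta = amount if kind == "deposit" else -amount
        balances[account] = balances.get(account, 0) + delta
    return balances


def replay_v2(events: List[Event]) -> Dict[str, Dict[str, int]]:
    """Version 2 schema: balance plus a transaction count per account."""
    state: Dict[str, Dict[str, int]] = {}
    for kind, account, amount in events:
        row = state.setdefault(account, {"balance": 0, "count": 0})
        row["balance"] += amount if kind == "deposit" else -amount
        row["count"] += 1
    return state


# The event log is the source of truth; either schema can be rebuilt from it.
log: List[Event] = [
    ("deposit", "acct-1", 100),
    ("withdraw", "acct-1", 30),
    ("deposit", "acct-2", 50),
]

print(replay_v1(log))
print(replay_v2(log))
```

Because the log is kept, backing out of the version 2 schema loses nothing: the version 1 state can be rebuilt by replaying the same events, including those recorded after the upgrade.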

Paulo Caroli: Another important point: don’t forget to automate almost everything you have, including the panic button. Things go to production and you run your automated tests to verify that things are fine in production, and in case they’re not, you’ve got the rollback. The rollback should not be an all-manual process or "Let me call the DB"; it’s another automated function that you might start by hand to roll back.
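The automated "panic button" can be sketched as follows. This is a toy model under stated assumptions: versions are plain strings, `deploy_with_rollback` and `healthy` are hypothetical names, and the smoke test simply pretends that any version tagged "-broken" fails in production.

```python
from typing import Callable, List


def deploy_with_rollback(
    current: str,
    candidate: str,
    smoke_tests: List[Callable[[str], bool]],
) -> str:
    """Return the version left live after an automated deploy-and-verify step."""
    live = candidate  # switch production to the new version
    if not all(test(live) for test in smoke_tests):
        live = current  # automated rollback, no manual steps
    return live


def healthy(version: str) -> bool:
    # Hypothetical smoke test: versions tagged "-broken" fail in production.
    return not version.endswith("-broken")


print(deploy_with_rollback("1.4", "1.5", [healthy]))
print(deploy_with_rollback("1.4", "1.5-broken", [healthy]))
```

A person may still decide when to start the deployment, but the verification and the rollback path are both code, so reverting does not depend on anyone remembering manual steps under pressure.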

10. Martin, a question about the ThoughtWorks book. There are a lot of quite nice techniques over there. How is the community taking these techniques and applying them? Are there companies using continuous integration, even unit tests and automating their deployment?

Martin Fowler: Yes. If we take the continuous deployment thing, I think that’s a particularly good thing to talk about. Here there are quite a few examples where we go in to a client and deployment takes months and usually finishes with this horrible weekend where you go in on Friday night, act panicked for the weekend, desperately trying to get your system up and the new version up by Monday morning.

It takes a few months to gradually bring in the kind of heavily automated procedures that we use and things of that kind, but after a few months we get it to the point where every couple of weeks we go into production, no one has to stay over the weekend and it’s a straightforward process.

That’s something that we are very confident about doing, but it’s something that’s evolved over the course of several years. It’s also something we are talking about more and more. You've seen the continuous delivery book that’s coming out shortly. We talk about techniques around this. That also was the driver behind Cruise Go - our commercial tool.

We began continuous integration back around 2000-2001 with CruiseControl, and that focused on the continuous integration part of the approach. But what we needed to do was to take it further, take it to the deployment stage. Now we see tools coming in that look at continuous integration, which has become quite a popular area.

Even with these tools, to actually get to where you can do continuous deployment, you need a different set of capabilities. That’s what drove us towards Cruise Go, because we had to build them ourselves for our clients and we thought we could make a product out of this.

Through the ideas in the Humble & Farley book, through tools and through our own practices (because that’s what we do on our projects), we’re continuously improving that kind of cycle. You’ll find that kind of thing on other topics in the anthology as well, but that’s particularly visible.

Paulo Caroli: Talking about the anthology, there are two things. One, which I think is very interesting, is that it’s not a book about how we think we should be good at software and which practices we should use; it’s about what has been used on the client side.

We believe it should be shared with people. That’s why a lot of interesting things come out there, and they are real-world stuff that we use. Another thing to note is that I think there is a new anthology book coming out soon. Again, a bunch of ThoughtWorkers will write about their practice, we’ll put them together and another anthology book will come out.

Martin Fowler: It’s still in the early days. I don’t know how soon 'soon' is, but it’s in process.

Paulo Caroli: You know more about it.

11. I would like to thank you both, Paulo and Martin, and I hope to see you again often here in Brazil and I hope to see you also with ThoughtWorks coming back.
