Matthew Skelton on DevOps and Continuous Delivery Challenges


1. Hello. I'm Manuel Pais and I am here at QCon London 2015 with Matthew Skelton who specializes in helping organizations adopt good practices in continuous delivery and DevOps. Thanks for accepting our invitation. As a first question, can you briefly introduce yourself to our readers?

Yup. I've been involved in building commercial software systems for quite a while since about 1998, mostly on the software development side but more recently I've been involved in building deployment and infrastructure automation and things like this. And over the last year or so I've increasingly been looking at how teams interact with technology and how that can help or hinder continuous delivery and DevOps.


2. Your talk here at QCon focused on tools and collaboration for continuous delivery and DevOps. How do you approach both aspects when you go in an organization to help them with their initiatives in this field? Do you try first to improve their culture if it's needed or look at the tools first, how do you go about?

It's a good question. In fact, in reality it's usually a combination of trying to change the culture or collaboration and trying to change the tools at the same time. Some tools can help to kick start a new way of thinking. If you choose the tool in the right way and use it in the right way, then actually that can be a real enabler for some sort of cultural change. At the same time, if you choose the wrong sort of tool that can really drive teams apart. So yes, it's a combination of the two together.


3. And in your experience what are the main challenges your clients have towards adopting DevOps and continuous delivery?

These challenges exist at lots of different levels in the organization. So at a sort of middle management level, quite a lot of organizations, particularly medium-sized and larger ones, haven't really realized the scale of the changes that they're going to need to make. So they haven't really budgeted for it. In their head it's just a few little tweaks. It might just be: just put Jenkins in as a deployment pipeline tool and they think that's about it, but actually the challenge is much bigger for them.

Higher up in the organization, if you like, or in a different department, let's say in commercial or the finance department, the way in which the software and systems are funded within an organization can have a really negative impact. Particularly project-based funding for very short durations of time and a delivery date which has to be met. This kind of thing can really cripple the capacity for making DevOps or continuous delivery happen. So there are some challenges there too.

It does depend on the size of the organization. In smaller organizations, as you would kind of expect, these problems are generally fewer or smaller in size. But even organizations of only sort of 200 or so people, which is fairly small, can still have acquired ways of working which really hurt that stuff. So the conversation you need to have is actually with quite a wide range of people, from developers and operations people to architects and managers, but even speaking to commercial people and finance and this kind of thing as well.


4. And in particular for enterprises with long running software systems, I imagine there are specific difficulties they have. And what kind of migration paths would you recommend for those cases?

It's a good question. So take a fairly extreme example, like an organization that maybe runs its e-commerce system on a mainframe. Some people would say, oh, you can't do anything with that because it's a mainframe and it's really old technology and you may as well forget it. But in actual fact, what we've seen is that in these organizations there's still a huge amount of scope for improving their practices and processes and testing. Cooperation between operations and development in these kinds of places is often almost nonexistent.

So actually, there's a huge amount of improvement that can be done even if you are running on a big old mainframe. Yes, there will be certain things you can’t do but I think that organizations should not think that they're special. They shouldn't think that because they have a particular piece of legacy technology that they can't do anything, there will be a huge amount that they can still do. If that becomes a bottleneck, then maybe they’ll need to change it but I’d imagine that would be some way down the line for lots of these organizations.

Manuel: So a lot of cultural issues that can be addressed.

Cultural issues but also technical practices, just getting continuous integration much better than it has been, having a really sensible approach to testing. So Amy Phillips talked just now about the way they changed testing at Songkick to adopt continuous delivery, and the changes they had to make were actually quite radical. So there are lots of good technical practices that can really be sharpened up in these organizations even if they can't use some of the latest and greatest technologies.


5. Speaking of continuous delivery, do you think the concepts and practices have matured enough and are widespread and understood today? In other words has it become a commodity like continuous integration or are we still a bit far from that?

I definitely do not think it's a commodity. I don't even think CI, continuous integration, is a commodity either. I think what you see is lots of tools out there which will offer a continuous integration kind of server or service and that's fine, but actually continuous integration is a combination of tooling and a discipline. So you can't just get Jenkins or buy another CI tool and say we're doing CI. Actually, you can still have a tool in place and not really be integrating continuously because you've got multiple long-running branches. So I think we're a long way away from CI, or even CD, being a commodity. What we are seeing though is that awareness of these practices is definitely hugely increasing, which is great to see.
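The point about having a CI tool yet not integrating continuously can be illustrated with a small check. This is only a sketch under stated assumptions: the branch names, the timestamps, and the one-day threshold are all hypothetical and do not come from the interview, and the function `stale_branches` is invented for illustration rather than taken from any CI product.

```python
from datetime import datetime, timedelta


def stale_branches(last_integrated, now, max_age=timedelta(days=1)):
    """Return the names of branches whose last integration into the
    mainline is older than max_age (an assumed, arbitrary threshold)."""
    return sorted(
        name for name, when in last_integrated.items() if now - when > max_age
    )


# Hypothetical data: when each branch was last merged into the mainline.
now = datetime(2015, 3, 5, 12, 0)
branches = {
    "feature/checkout": datetime(2015, 2, 20, 9, 0),
    "feature/search": datetime(2015, 3, 5, 8, 30),
}

stale = stale_branches(branches, now)
print(stale)
```

A team could have a green build on every one of these branches and still be far from integrating continuously, which is the distinction between tooling and discipline made above.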


6. But in the case that you have, let's say, the practices in place. On the tooling side, would you say that for continuous delivery you really need a dedicated tool or it's enough to have a CI tool with some plugins and orchestration? So purely from a technical point of view, is it essential to have a continuous delivery tool?

I don’t think it's essential. I think the crucial thing there is that people understand the capabilities of some of these tools and realize that continuous delivery can really benefit from having really strong visualization and kind of role-based security as part of the deployment pipeline. So if you think about CI and some of the oldest CI servers like Jenkins and so on, they do their job really, really well but they were never originally designed to kind of represent the state of a deployment pipeline.

So there are certain things which, at least until recently, some of them didn't do so well, but that might work for an organization. If that works for them, that's great. But I'd definitely recommend organizations looking in this space to just kind of have a look at some of the tools that have more dedicated continuous delivery features and see what they offer, because that can be a real enabler for teams in some bigger organizations.


7. [...] What is your view on the impact of these new trends?

Manuel's full question: Now talking about microservices and containers, and particularly Docker, they are all the hype these days. What's the impact, do you think, in terms of continuous delivery? For example, this might cause more pipelines and more fragmentation of the system, so it's harder to have a global picture of the status of the system and it's harder to test. And the other part of the question, what's the impact on DevOps as, for example, microservices require more automation to deploy services independently, better provisioning, monitoring, all these kinds of things? So what is your view on the impact of these new trends?

One of the impacts of microservices is going to be the increased need for skills and capabilities within teams of treating monitoring and metrics as a really first class thing because without the ability to interrogate services or to discover new end points, this kind of thing, teams will get to the point where they don't really know what's going on. There's a danger there. So yeah, an increased focus on monitoring and metrics is going to be essential to make that stuff work.

I think the additional piece which not many people are really talking about now is what happens to data and with data stores sort of fragmenting with microservices. I'm not saying microservices are wrong. They're an optimization for deployability which is fine, making changes quickly. But there's a danger I think that we'll see data fragmentation. And in the past data architects or people involved in data structure, they have been able to rely on all of the data being in a single database, using relational database which are easy to query, and now we're moving towards a world where the data is more distributed. I think that's going to be a real challenge for some companies. I don't think we've seen the patterns of microservices for long enough to have any problems in production. I don’t think anyone has seen those kinds of problems here.


8. That actually brings me to the next question, which was the fact that, as you said, microservices might require more robust or more complex data architectures and bring more fragmentation. Do you have any recommendations to deal with that, how would you go about doing that?

I have not seen any tools really in this space yet, certainly not outside enterprise tools. At a simple level, I think organizations should be aware of that as a problem, that data fragmentation is a problem. Make sure someone actually has a handle on the data strategy so they know the kind of things they need to think about. In the worst case, you have two teams building different parts of the system and both start to track customer ID, for example, in two separate databases, two separate data stores, so which is now the definitive list of customers for our organization? Does it matter? It might not matter. But if it matters, then there's the potential for that kind of conceptual integrity problem with the data there. That's something I am quite interested in.
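The two-teams scenario above can be made concrete with a small comparison of customer IDs held in two data stores. This is a minimal sketch, not a tool mentioned in the interview: the store names ("billing" and "orders"), the ID format, and the function `compare_customer_ids` are all hypothetical, and the stores are represented as plain in-memory lists.

```python
def compare_customer_ids(store_a, store_b):
    """Compare the customer IDs held by two independent data stores and
    report which IDs appear in only one of them."""
    a, b = set(store_a), set(store_b)
    return {
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
        "in_both": sorted(a & b),
    }


# Hypothetical exports from two services that each track customers.
billing_ids = ["c-001", "c-002", "c-003"]
orders_ids = ["c-002", "c-003", "c-004"]

report = compare_customer_ids(billing_ids, orders_ids)
print(report)
```

If either "only" list is non-empty, neither store on its own is the definitive list of customers, which is the conceptual integrity question raised above. Whether that matters remains a decision for the organization.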


9. And speaking of another topic which is ITIL, you help clients also with their ITIL requirements, right? How do you go about eliminating the potential barriers between lightweight Agile and DevOps practices and at least apparently heavyweight document-driven ITIL practices?

Yes, I think it's important to point out that ITIL is not fundamentally opposed to Agile and DevOps and that's a really crucial thing. The way in which it is implemented in many organizations has turned into something that works against it. What we have done is found people within the organizations where ITIL or similar service management frameworks are in place, find the people who really get it, if you like, who are open to being more flexible to do things more effectively and working with them directly to engage them.

An obvious thing is automating the standard changes. I think Steve Thair from DevOps Guys, in his talk this morning, talked about the fact you could almost characterize DevOps as ITIL but with 95% of standard changes being automated, which is a pretty good definition as far as I can see. Because a lot of the stuff in ITIL is absolutely on the money with DevOps. The core of ITIL is the concept of continual service improvement rather than shipping products into production and forgetting about them. What is baked into ITIL is the concept that we're going to improve the service over time. That's what we're talking about with the DevOps approach as well.

It's not easy because of the historical baggage that ITIL comes with. Part of it is setting aside some of the terminology. Part of the solution I think as well is a little bit of increasing awareness within, say, the development teams, of what these different bits of ITIL terminology mean. Breaking them down or translating them, if you like, and say actually this makes a lot of sense. A lot of this is just common sense. So rather than it being a crazy process that's really impenetrable, just breaking it down into simple language and explaining what the stuff actually means, what the original purpose behind ITIL was.


10. That's a great answer. In your talk you also mentioned Conway's law and team topologies. Could you briefly explain the different topologies you've encountered and the implications in those organizations?

I have been looking for and collecting different kinds of team topologies, different configurations of teams within organizations for I think about two and a half years now and I've been collaborating with a few people online. We're pulling together a little catalog of the ways in which different teams are organized. Some patterns, if you like, and some anti-patterns too. The two main ways in which DevOps seems to work for organizations, you could characterize it as there's one where development and operations teams essentially collaborate quite closely on certain things but not necessarily on everything.

Actually, three main patterns. That's the first pattern where there's a little bit of overlap but still have some distinct Dev and Ops roles. There's another pattern where essentially each team has a full end-to-end capability including all of the monitoring and metrics and production support and everything like that, so developers on call and this kind of thing where the team shares a whole lot of responsibility together. And that seems to work for some organizations too. Certainly from what I can see, organizations which have quite a focused product and don't necessarily have lots of B2B type clients and that sort of thing. Netflix, for example, use that model and Facebook and so on and their product is relatively focused.

The other model is where you could call it infrastructure as a service where the work done by the people we traditionally call ops, things like stacking servers and plugging in network cards and doing backups and things, all that stuff is essentially outsourced to, say, Amazon or Azure or Rackspace or one of these organizations. Now, that infrastructure service model also seems to be where quite a lot of organizations are heading as well but with an internal capability as opposed to using Azure or Amazon.

They may be heading in that direction accidentally or deliberately. Accidentally because they've actually hired a lot of people with quite strong skills and it will be quite difficult for them to readjust to a more integrated, say, DevOps way of working. Or perhaps it's budget constraints or the way in which those teams are funded which is very, very different. So sort of the two or three main models are: there's some collaboration, a small degree or a large degree, but then another model which is actually the traditional ops or a lot of the ops stuff is sort of in a separate team where there's very little collaboration and that can be a good thing. Some organizations seem to be able to do really well with that.


11. And how could an organization assess if their current team topology is adequate for the kind of system architecture and the kind of deployment pipeline that they want to have as a target?

The organization has to be really quite honest about the teams' capabilities and where the technology is going in time. So I suppose that the organization would actually need to have a sense of its own technological roadmap before it can start to assess that. There are quite a lot of organizations that don't really have that view. So that will be a good starting point to try and work out. Should they go to a public cloud? Should they go to a private cloud? Should they go to PaaS? Should they rebuild things? To what extent are they willing to invest in retraining people or giving people additional skills? For example, programming Ruby if they want to do some Chef between development and operations. And make a choice.

At one level it sort of doesn't matter which topology, which sort of mix of Dev and Ops teams together and sharing skills an organization goes for. But if they don't make a choice, then it's going to be quite painful because they will be tempted to do something which probably doesn't match. Conway's law starts to come into effect.


12. So being deliberate about the roadmap and also, as you mentioned before, the type of product or how many projects they develop? Okay. Recently you co-edited a book compiling DevOps and continuous delivery experience reports from industry called Build Quality In. What was your motivation or what surprised you the most in this process?

It's worth saying, without the input from my co-editor, Steve Smith, I don't think that book would have got published, honestly. He really pushed a lot to get that one working so that's really great. We were both involved in Pipeline Conference last year and we realized that there are lots of people doing really good things in the DevOps and continuous delivery space. We wanted a way to kind of showcase that really, in a form that let them tell their story themselves. So it felt like the right thing to do to pull that together. We ended up with 20 authors in total and forewords by Dave Farley, who wrote the Continuous Delivery book, and Patrick Debois, who is the grandfather of DevOps. It worked out really well.

What surprised me about it? It's been nice that we now have nearly, I think, 200 subscribers, or 200 people who have bought the book, which is great. We haven't really done that much advertising yet so that's been nice. Reading through the different contributions from the authors in the book, what's sort of surprising is the extent to which a lot of what we're talking about with DevOps and continuous delivery at one level just seems like common sense. Now, that might be a kind of hindsight bias where it seems obvious now and didn't before, but it almost seems like a lot of the stuff that has been normal in the industry until now is so far from common sense. We're just getting back to a kind of sanity.

Manuel: So the feedback to the book so far has been positive?

Has been good, yeah.


13. The final question, I'd like to ask if you want to share any other pet projects you might be involved in? For example, still about the book, I know that you are donating 70% of the royalties to Code Club which is a volunteer led initiative to teach children to code after school. Do you want to expand on that?

It was nice to be able to support Code Club in the UK. They run coding clubs for kids 9 to 11 years old after school. All around the country now, I can't remember how many clubs there are. There are, I think, hundreds now in different parts of the UK. I think they're branching out into other parts of the world too. They do training for the computer science curriculum for schools in the UK as well. It felt like a really good thing to support to help improve diversity within the industry. There's just a big shortage of people coming into the industry who really understand what software is and how software systems work. So it felt like a really useful thing to do. We supported them through Pipeline Conference as well. We gave a donation last year and we'll do the same this year.

So other things outside work, if you like. Well, I founded Pipeline Conference last year. That was the first one we did. We've got the next one in two weeks. So I'll be busy with that very soon. We're very pleased with how that's going. We've got Linda Rising, who is an expert in Agile and organizational change, this kind of thing. She is giving our keynote. I'm also involved in the London Continuous Delivery meetup group. So we've got a thousand or so members now and we run an event pretty much every month. That seems to be quite good, quite useful. It's quite small as well, we have maybe sort of 70 to 80 people at each event, perhaps a little bit less. It still feels quite small and intimate. It doesn't feel really anonymously huge. So it's nice to be able to offer people the chance to speak on what they're doing.

I am writing another book on software operability with my colleague from Skelton Thatcher Consulting. We're doing that via LeanPub, the same publishing mechanism we used for Build Quality In. LeanPub is really awesome actually, I have to say it's a really, really good mechanism for publishing and the way in which you can release chapters at a time and people will get updated. So they buy the book at the beginning and when a new chapter comes out, they automatically get the new version. It feels really, really nice. It's really well done. So the book on software operability, the first version will be live probably by the end of April this year, 2015. And then we're hoping to have the book essentially complete by September. So that's pretty much the only thing I can afford to spend my time on outside work.

Manuel: Okay. Well, looking forward to reading the book and thank you very much for taking part in this interview.


Apr 28, 2015