Key Takeaways
- The most popular agile framework, Scrum, predates the growth of DevOps. As a consequence, the practices within Scrum (and other Agile frameworks) are overwhelmingly focused on what you might loosely define as the development aspects of software delivery, and less focused on the Operational aspects.
- A blended DevOps approach requires some re-thinking around teams, backlogs, how user stories are written, and so on. For example, the backlog should include operability items such as scalability, deployability and monitoring, alongside user-facing features.
- Sprint Planning should include some DevOps aspects so that you discuss not only product functionality, but operability features as well.
- A conventional Scrum Master may not fit well into this blended approach - the role is more that of an agile coach.
- We need to consider DevOps right from the moment we hire our team members, through the planning and building of our products, right through to their ultimate retirement.
There’s no point building a super-cool, super-functional product that looks and feels awesome for the customer if you can’t deploy it, maintain it, and support it once it’s gone live.
In the Agile world, great efforts have been put into making sure we deliver what the customer expects, within reasonable budget, and on-time. We also go to great lengths to help our customers determine the highest priority features, so that we shift our focus towards delivering high business value. We deliver early and often to get regular and relevant feedback. We use “user stories” to help us think from an end-user’s perspective, and we test our code on every commit to make sure we’re not breaking our codebase.
This is great, but where are all the clever tricks and techniques designed to ensure we deliver deployable, scalable, performant products that can be updated in real-time, monitored from the very second they’re built, and managed from day-to-day without needing a team of support engineers?
Agile has borrowed (and continued to evolve) great ideas from the automotive industry, the neuroscience world, ancient philosophy, the military and mathematics, to name but a few (think Lean manufacturing, cognitive bias, servant leadership, planning and relative sizing). It’s now time to borrow some thinking from the DevOps scene to ensure Agile remains the most suitable and successful set of principles and practices for delivering products.
Most products spend the majority of their lives being supported and maintained after they’ve been launched (bug fixes, feature releases and enhancements for example). The practical way in which these are managed (rolling out changes to a “live” service, testing in “live-like” environments and so on) as well as how the product can scale for performance are seen as “Operational Features”, and are often nowhere to be found on the product backlog.
A Gartner report from 2006 put the figure as high as 80%, whilst a more recent ZDNet article cites a survey from consulting firm CEB which “found that 57 percent of the budget will go towards maintenance and mandatory compliance activities, down from 63 percent back in 2011.”
DevOps teaches us that Operational Features, or “Operability”, is actually a first-class citizen, and should be treated with as much regard as any other product feature. The best way to ensure this happens is to foster a strong culture of collaboration between development teams and operations. Quite how you achieve this collaboration is another question, and “DevOps” models can differ quite wildly, from the Amazon “You build it, you run it” approach, where both development and operational activities exist within a single product team, to the “DevOps as a platform” approach found in some Google teams.
The Need for DevOps
Agile and DevOps have lived side by side for a few years now, and there has been plenty of discussion about the relationship between the two.
Some people see DevOps as a subset of Agile, others see it as “agile done right”, and others see it as a set of practices around automation, loosely connected to the Agile big picture. It all depends on your definition of DevOps. But regardless of how you see DevOps, the intention of delivering working software which can be managed, maintained, scaled, supported and updated with ease is something the software delivery world desperately needed.
The way we run and operate software has changed massively since the days when our agile frameworks were invented. Scrum started back in 1993, DSDM launched in 1994 and the XP book was published in 1999. Back then we were writing MSI installers, burning them to discs and posting them to people!
Running, maintaining and operating software was generally not something most software developers were involved with in any way.
Since then a major shift to SaaS and PaaS has taken place, putting the production environment at our fingertips. Developers are now actively involved in the operation and support of their systems, but we’re still following frameworks that don’t accommodate this change in the way we work.
Continuous Delivery to the Rescue, Almost.
Continuous Delivery requires deployment automation. This was a step in the right direction, even if it did, in some organisations, inadvertently create a spin-off profession of Continuous Delivery engineers (thus often creating another silo). Continuous Delivery engineers grew into Continuous Delivery Teams, and eventually “Platform Teams” as infrastructure management became increasingly central. In many cases, this “shifting out” of the Continuous Delivery aspect into a separate team seemed natural, and allowed many Agile teams to get back to what they felt most comfortable with - developing software, rather than delivering it.
Unfortunately, this fits the Scrum framework quite nicely – the Development Team focus on designing, developing and testing their software, the Continuous Delivery Team focus on managing the system that deploys it, and the underlying infrastructure.
The trouble with this approach of course, is that by separating the build and deployment automation work, along with the infrastructure management tasks, we’re essentially making it “somebody else’s problem” from the perspective of the agile team, and “Operability” once again disappears into the background.
Many teams did embrace Continuous Delivery “the right way”, and that enabled them to adopt a “we build it, we run it” approach to software delivery (and with that a greater sense of ownership, and improved quality as a result), but the same cannot be said for everyone. Evidently there’s considerable resistance to incorporating new practices into particular agile frameworks, even if those practices are themselves “agile” to the core. And now we’re seeing the same thing with DevOps.
DevOps Anti-Patterns
The main focus of DevOps is to bridge the gap between Dev and Ops, reducing painful handovers and increasing collaboration, so that things like “deployability”, scalability, monitoring and support aren’t simply treated as afterthoughts.
However, we’ve already started to see strong anti-patterns emerging on the DevOps scene, such as the separation between the Dev team and the DevOps team, effectively creating another silo and doing little to increase collaboration.
The problem is that there is very little information on how to actually blend your Agile development teams with this new DevOps approach, from a practical perspective.
What practices do we need to adopt? Which practices do we need to stop doing? How do we get started? What roles should we have in the team? These questions remain largely unanswered. As a result, teams are “bolting-on” DevOps rather than fully integrating it into their software development processes.
In this classic DevOps anti-pattern we have all the agile ceremonies happening, and many of the usual DevOps practices as well, but the end result is no better than before – Operability is still an afterthought and products are optimised for development rather than delivery and operation. This is all because the key DevOps practices are being “bolted-on” rather than baked-in from the start.
The solution, of course, is to bake these good DevOps practices in from the very beginning, by absorbing them into our daily agile processes and practices – and this is what requires some tweaks to our agile frameworks.
Updating Agile Practices
So what can we do to ensure we’re developing software in an agile manner, while also delivering and maintaining our products and services in accordance with some of the latest and greatest DevOps best practices? Well, it’s easy – we just shift left!
Ok, that sounds a lot easier than it is in practice, but the concept is straightforward enough. We do this by adding Operability tasks and stories to the backlog, alongside our user stories. Our backlog suddenly becomes the full set of epics, stories and tasks needed to get our product delivered successfully and then maintained once it’s gone live (as opposed to simply a set of functional features from an end-user’s perspective).
On the surface this might sound easy, but there are a couple of considerations, such as:
- Who’s going to work on these Operability stories/tasks?
- How can you write an Operability story if there’s no end-user?
- What are these so-called DevOps best practices?
- How can a Product Owner be expected to manage this?
The answer to all of these questions is: “things will need to change”.
Teams:
Most Agile teams we work with don’t include ops, support or infrastructure specialists. You might argue that there’s insufficient demand for such specialisms to be in each and every agile team, and you might be right, but don’t forget that people said exactly the same thing about testers, and architects, and database engineers, and UX, and so on…
If the way you deliver, support, update, scale and maintain your product is important, then you need these skills in your team.
Is this going to mean that you have to break Jeff Bezos’ “2 Pizza Team” rule? Maybe. But if your share of pizza is that important to you, you could always skill up! ☺ (this isn’t actually as daunting as it sounds – the more we move towards an x-as-a-service world, the less hard-core sysadmin knowledge you need. Instead, we’ll all need a firm understanding of cloud functions and related services).
The Backlog:
If we have cross-functional teams, then we’re going to need cross-functional backlogs.
Leave the traditional view of a product backlog in the past – it’s time for a fresh approach which embraces the operability aspects of our services. And we use the term “services” intentionally, because what we tend to build these days are indeed services, not shrink-wrapped products. Services are products that need to be deployed, scaled, maintained, monitored and supported, and our backlog needs to reflect this.
Most Scrum product backlogs we see contain something like 90% traditional features, best described as a collection of desirable features from an end-user’s perspective. The remaining 10% tend to be performance related or something to do with preparation (setting up dev environments, prepping databases, and so on). The weighting towards end-user functionality/product features is very revealing. I’m not sure if this is a consequence of the Scrum framework itself, or a result of end-user bias by Product Owners (or something else entirely).
Instead, a modern Service Backlog should describe (besides user functionality):
- The scalability of the product/service (up, down, in, out – and when)
- The deployability (does this need to be deployed to live with no downtime?)
- Monitoring of the service (what aspects need monitoring? How do we update our monitoring with each change?)
- Logging (what information should be logged? Why? And in what style?)
- Alerting (who? When? How? Why?)
- The testability of the service
- Security and compliance aspects, such as encryption models, data protection, PCI compliance, data legislation, etc.
- Operational performance
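To make this concrete, here’s a minimal, purely illustrative sketch in Python (the item names, categories and fields are invented, not a prescribed schema) of what a cross-functional Service Backlog might look like, with operability items competing for priority alongside user-facing features:

```python
from dataclasses import dataclass

@dataclass
class BacklogItem:
    title: str
    category: str   # e.g. "feature", "deployability", "monitoring", "security"
    what: str       # WHAT needs doing
    why: str        # WHY it matters (the context and value)
    priority: int   # a single priority order shared across all categories

# Invented examples: one end-user feature and two operability items,
# all sitting in the same backlog and ranked against each other.
service_backlog = [
    BacklogItem("Card payments", "feature",
                "Allow customers to pay by card at checkout",
                "Core revenue-generating functionality", 1),
    BacklogItem("Zero-downtime deploys", "deployability",
                "Deploy new versions without taking the service offline",
                "We release several times a week and can't afford outages", 2),
    BacklogItem("Checkout monitoring", "monitoring",
                "Alert when the payment failure rate exceeds an agreed threshold",
                "So we spot checkout problems before customers report them", 3),
]

# One backlog, one priority order: operability work is prioritised in the same
# list, by the same Product Owner, as everything else.
for item in sorted(service_backlog, key=lambda i: i.priority):
    print(f"{item.priority}. [{item.category}] {item.title} – {item.why}")
```

The data structure itself is beside the point; what matters is that operability items live in the same, singly prioritised backlog rather than in a separate “technical” list.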
Skelton says: “To avoid ‘building in legacy’ from the start, we need to spend a good portion of the product budget (and team time) on operational aspects. As a rule of thumb, I have found that spending around 30% of product budget on operational aspects produces good results, leading to maintainable, deployable, diagnosable systems that continue to work for many years.”
It should be noted that these Operability and Security requirements are continually changing and evolve with the product/service, so one cannot simply get them all done at the start of a release and then move on to the traditional product features. For example, it may not be cost-effective to implement an auto-scaling solution for your system until that system is commercially successful. Or perhaps you need to change your encryption model to conform to new security compliance requirements. Equally, you may need to change your deployment model when new geographic locations come on-line. And monitoring will usually need updating whenever any sizeable change to an application’s functionality takes place.
User Stories
User stories are a fantastic way of capturing product requirements from the perspective of the expected outcome. User stories have helped many developers (ourselves included) to think about problems from the end-user’s point of view, and focus on solutions to problems rather than simply following instructions. I’m referring of course to the way user stories focus on the “what” rather than the “how” (a good user story presents the problem, and leaves the solution up to the developers).
User stories are often written in the following format:
“As a…
I want…
So that…”
This forces us to write them from a user’s perspective (although not necessarily an end user).
However, over the years we have found that writing Operability stories using this format doesn’t really offer the same improvements. This may be because the “user perspective” has little bearing on the way the solution is technically implemented. Regardless, it feels a little redundant writing “As a sysadmin” or “As a developer” if you’re implementing the solution yourself.
It’s not particularly unusual to see “technical backlog items” written in a backlog without adopting the “As a… I want… So that…” format, and similarly I tend not to recommend the format for Operability features. I instead prefer to use a “What and Why” format, which simply states WHAT needs doing and WHY (to provide context) – for example: “WHAT: add automated alerting for failed deployments. WHY: so that failed releases are spotted and rolled back quickly.”
Sprints:
Two week sprints feel about right for developing new features, testing and deploying them, and then demonstrating them to stakeholders. Any longer and it becomes hard to maintain focus, and the feedback loop stretches to an uncomfortable length. Any shorter, say one week, and suddenly meetings and other ceremonies take up an inordinate percentage of your actual sprint time, meaning the amount you can get done feels tiny. So two weeks feels right for many people. It’s just the right amount of time to get your head down and focus on what you’re committed to doing.
This is great if you’re developing a new product, but what if you’re iterating through some improvements or developing the next version of your product?
Who’s going to look after all the constant issues that arise from the production platform?
If you’re exposed to a high degree of interruptions such as these (or, just as damagingly, a widely varying degree of them) then you’ll be well aware that they can wreak havoc with your sprint commitments. Two week sprints seem to add a lot of value in terms of helping people focus on a realistic target, but in an unpredictable environment it can be hard to determine exactly how much you can achieve from your backlog. Difficult, but not impossible.
If we measure the average amount of work we can burn through from our backlog, and the average amount of “interruptions” we get from the production platform, we can essentially deduce two velocities.
Your backlog velocity is the rate at which you complete “planned” work from the product/service backlog, while your “unplanned” velocity is the amount of unplanned work that hits the team during the sprint. Tracking these two velocities allows us to plan more effectively.
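As a minimal sketch of that idea (assuming you record completed story points per sprint, split into planned and unplanned work; the numbers are invented purely for illustration), the two velocities and a realistic commitment for the next sprint could be derived like this:

```python
from statistics import mean

# Completed story points per recent sprint, split into planned (backlog) work
# and unplanned work (production interruptions, escalations, bug fixes).
# All figures are invented, purely for illustration.
recent_sprints = [
    {"planned": 21, "unplanned": 8},
    {"planned": 18, "unplanned": 12},
    {"planned": 24, "unplanned": 6},
]

backlog_velocity = mean(s["planned"] for s in recent_sprints)      # ≈ 21 points/sprint
unplanned_velocity = mean(s["unplanned"] for s in recent_sprints)  # ≈ 8.7 points/sprint

# The team's overall throughput is the sum of the two; when planning the next
# sprint, reserve capacity for the interruptions you know (on average) are
# coming, and commit only the remainder to items from the backlog.
total_throughput = backlog_velocity + unplanned_velocity
planned_commitment = total_throughput - unplanned_velocity

print(f"Backlog velocity:   {backlog_velocity:.1f} points/sprint")
print(f"Unplanned velocity: {unplanned_velocity:.1f} points/sprint")
print(f"Commit roughly {planned_commitment:.0f} points of planned backlog work")
```

Unsurprisingly, the planned commitment works out to the historical backlog velocity; the value of tracking the split is that when unplanned work trends up or down you can adjust the commitment (and have the data to justify it) rather than repeatedly over-committing.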
Kanban is of course another option, which can accommodate both planned and unplanned work, and is often the framework of choice for teams who have little insight into what they’ll be working on in a week’s time. It can also be highly effective for delivering longer-term projects/releases, but it requires a high level of discipline to ensure that the backlog is continually and correctly prioritised.
Sprint Planning
If you’re doing sprints, then you’ll need to do sprint planning. To bring a DevOps perspective to your sprint planning, you need to do the following:
- Invite ops/infrastructure/support people to the planning session
- Discuss not just product functionality, but operability features as well
- Plan them into the upcoming sprint
- Take into consideration the time and effort that will be consumed by “interruptions” – that is, unplanned work coming from the Production Platform, such as bug fixes, escalations and so on (this value is your “unplanned velocity” and effectively reduces your backlog velocity: the higher your unplanned velocity, the lower your backlog velocity will be)
Definition of Done
A popular definition of done is “passed UAT”, which is basically another way of saying “the business has signed off the feature”. But this largely forgets about operability, security, performance, and so on. For a story to be considered “done” it needs to be ready to go live (or better yet, be in the live environment already). This means it needs to be scalable, performant, monitored, secure and obviously deployable! If your story doesn’t satisfy all of these, it’s not done.
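Purely as an illustration (the criteria names below are hypothetical, not a standard), an expanded Definition of Done can be made explicit and checked story by story:

```python
# Hypothetical expanded Definition of Done: a story only counts as "done"
# when the operability criteria are satisfied, not just functional sign-off.
DEFINITION_OF_DONE = [
    "passed_uat",           # the business has signed off the feature
    "deployable",           # releasable via the standard deployment pipeline
    "monitored",            # dashboards and alerts updated for this change
    "performance_checked",  # meets the agreed performance targets
    "security_reviewed",    # encryption/compliance implications considered
]

def is_done(story: dict) -> bool:
    """A story is done only when every criterion is explicitly satisfied."""
    return all(story.get(criterion, False) for criterion in DEFINITION_OF_DONE)

story = {"passed_uat": True, "deployable": True, "monitored": False}
print(is_done(story))  # False – the monitoring hasn't been updated, so it isn't done
```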
Scrum Master
Bearing in mind that we’re going to need to bend or break some existing Scrum rules (see above for examples), the role of the Scrum Master is thrown into question. Even if you want to maintain a process that closely resembles scrum, the fact is, it isn’t scrum; it’s going to be a blend of Scrum and DevOps.
Certain aspects of the Scrum Master role are still perfectly valid – such as removing impediments – but the Scrum Master will now need to remove impediments not just to software development, but also to software delivery and maintenance.
Another option is to transition to an Agile Coach role, where the principles and values of Agile are adhered to, but in a way that’s sympathetic to your new processes, and not constrained by the prescriptive rules of the Scrum framework. The last thing you want is a Scrum Master who doesn’t appreciate the purpose of DevOps; that’s simply going to create an even bigger divide between the development and operations sides.
Product Owner
In our blended Agile and DevOps environment, our Product Owner needs to understand the importance of operability more than anyone.
In SaaS, PaaS and Serverless environments a lot of the value is hidden – it’s not in the front-end, it’s in how our services work. That value might take the form of time saved, money saved, increased performance, reduced risk, improved reliability or any other “hidden” benefit. Product Owners need to understand this, because ultimately they’re responsible for guiding the priorities.
Continuous Integration (CI) and Continuous Delivery (CD)
Some people recommend separating your CI and CD tooling, presumably because, whilst CI is more dev-focused, CD takes a more holistic view.
Whichever way you look at it, CI and CD are more than just tools, they’re actual ways of working. There are big differences between having a CI system and doing Continuous Integration. The exact same can be said of Continuous Delivery.
In our DevOps/Agile blended environment it’s essential that we not only use CD as a delivery mechanism, but also as a guiding set of principles and practices. This matters because Continuous Delivery brings development and operations into the same frame. A good CD pipeline will visualise all of the important steps in getting software delivered successfully and regularly – you can see with your own eyes how important it is to have available test infrastructure, reliable testing frameworks, good monitoring and deployment automation.
Remember the 8 Principles and 4 Practices of Continuous Delivery, as outlined by Dave Farley, paying particular attention to the key practices of “build binaries only once”, “use precisely the same mechanism to deploy to every environment” and “if anything fails, stop the line!”, but above all take heed of this message: “everybody has responsibility for the release process”.
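To illustrate two of those practices (“build binaries only once” and “use precisely the same mechanism to deploy to every environment”), here’s a deliberately simplified sketch; the artifact name, environments and deployment steps are placeholders rather than real tooling:

```python
# Illustrative only: the artifact format, environments and deploy steps are invented.

def build_once(version: str) -> str:
    """Build and package the release artifact exactly once, then reuse it everywhere."""
    artifact = f"myservice-{version}.tar.gz"  # hypothetical artifact name
    # ... compile, run unit tests, package ...
    return artifact

def deploy(artifact: str, environment: str) -> None:
    """Deploy the same artifact, via the same mechanism, to any environment.

    Only configuration varies per environment; the binary and the deployment
    steps do not, so every deploy to 'test' is a rehearsal of the deploy to 'live'.
    If any step fails, stop the line and fix it before going any further.
    """
    config = f"config/{environment}.yml"  # hypothetical per-environment config
    print(f"Deploying {artifact} to {environment} with {config}")
    # ... identical scripted steps for every environment ...

artifact = build_once("1.4.2")
for environment in ("test", "staging", "live"):
    deploy(artifact, environment)
```

The point is that development and operations share one artifact and one release process, which is exactly what “everybody has responsibility for the release process” implies.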
Conclusion
The most popular agile framework, Scrum, was designed for a time when teams didn’t tend to worry about operational issues, such as scalability, deployability, monitoring and maintenance.
As a result, the practices within Scrum (and other Agile frameworks) are overwhelmingly focused on what you might loosely define as the development aspects of software delivery, and less focused on the Operational aspects.
DevOps helps to redress that imbalance, but has little influence over the practices that happen during the development phase itself. The lack of definition around DevOps, and the lack of a prescriptive framework, mean there’s little or no information on how to bring DevOps thinking into your Agile software development processes.
To maximise the value of Agile and DevOps, you must start to implement some of the DevOps principles right at the beginning of your development process, because bolting on a bit of deployment automation at the end isn’t going to help you build more scalable, deployable and manageable solutions.
We need to consider DevOps right from the moment we hire our team members, through the planning and building of our products right through to their ultimate retirement.
This means we have to take a fresh look at some well-established concepts within Agile, such as the skillsets and roles within a Product Team, the Product Backlog itself, and how we plan and execute iterations.
Many teams have successfully adapted their Agile practices to become more DevOps aligned, but there’s no one-size-fits-all solution available, just a collection of good patterns.
About the Author
James Betteley is from a development and operations background, which is pretty handy for someone who now works in the DevOps domain! He’s spent the last few years neck-deep in the world of DevOps Transformation, helping a wide range of enterprise organizations use Agile and DevOps principles to deliver better software, faster.
Matthew Skelton has been building, deploying, and operating commercial software systems since 1998. Co-founder and Principal Consultant at Skelton Thatcher Consulting, he specialises in helping organisations to adopt and sustain good practices for building and operating software systems: Continuous Delivery, DevOps, aspects of ITIL, and software operability. Matthew curates the well-known DevOps team topologies patterns and is co-author of the books Database Lifecycle Management (Redgate) and Continuous Delivery with Windows and .NET (O’Reilly). @matthewpskelton