"Site Reliability Engineering - How Google Runs Production Systems" is an open window into Google's experience and expertise in running some of the largest IT systems in the world. The book describes the principles that underpin the Site Reliability Engineering discipline. It also details the key practices that allow Google to grow at breakneck speed without sacrificing performance or reliability.
Our efforts to improve software development face the question of what to focus on. Should we govern for predictability without concern for value, maximizing cost efficiency without concern for end-to-end responsiveness? Or should we do the opposite and govern for value over predictability, focusing on responsiveness over cost efficiency? What we really need is to be predictably adaptable.
This article describes two novel practices for continuous delivery: the latent-to-live code pattern and forward-compatible interim versions. You can use these practices to increase both the speed and the reliability of software development while reducing risk. They build on two other essential continuous delivery practices: trunk-based development and feature toggles.
Agile, with cross-functional teams, has sounded the death knell for many test managers. While test management is largely irrelevant in agile, there is still a desperate need for test leadership.
Enterprise security teams are charged with maintaining the “perfect” set of security policies. In their pursuit of the perfect security policy, they are often the department of slow.
By shifting its inner workings from a project perspective to a product perspective, Agfa Healthcare established a less complicated process that uses a single backlog for the entire organisation.
Little’s Law helps teams that use user stories for planning and tracking project execution, with a project buffer to manage inherent uncertainty of a fixed-bid project and protect its delivery date.
The book Conscious Agility (Conscious Capitalism + Business Agility = Antifragility) describes a design-thinking approach for business to benefit from uncertainty, disorder, and the unknown.
One of the largest sources of development waste is poorly formed requirements. This post presents a very simple technique that can be applied to all user stories to improve quality and reduce waste.
Product risk analysis (PRA) can be done during the various phases of sequential or agile system development. This article shows how to elevate PRA from the project level to the domain level.
This article discusses “human experience” testing and uses concepts from human-computer interaction design theory to establish a framework for developing “human experience” test scenarios.
Chris Haddad explains in this article what Shadow IT is, what role it plays in the enterprise and why Enterprise IT needs to embrace it, adapt and address Shadow IT requirements, autonomy, and goals.