Doing Safe-to-Fail Experiments


Safe-to-fail experiments can be used in complex environments to probe, sense, and respond. You have to know what success and failure look like and need to be able to dampen or amplify the effect of probing to handle potential failures. Safe-to-fail experiments can help you to deal with risks and uncertainty, learn, and keep your options open.

Liz Keogh, an independent Lean and Agile consultant, spoke about safe-to-fail at the European Testing Conference 2017. InfoQ is covering the conference with Q&As, summaries and articles.

Keogh started her talk with a short introduction of Cynefin. She stated that most IT initiatives are in the complex domain, where you can use a safe-to-fail approach, something she described in her blog post Cynefin for developers:

In a complex environment, you probe, sense and respond. You do something that can fail, safely, and it tells you things about the environment which you respond to, changing the environment. This is the land of high-feedback, risk and innovation.

In this domain, because the outcomes we look for keep changing, we can’t merely apply our expert practices and expect success. Instead, we have to change the practices we use based on what we learn. In this domain, we have emergent practices.

The Agile manifesto came into being because of this domain. We couldn’t get everything right up-front, so we started creating feedback loops within our process instead.

In the InfoQ summary "Experiment using Behavior Driven Development", Keogh explained how you can measure complexity on a scale from 1 to 5:

5. Nobody has ever done it before

4. Someone outside the org. has done it before (probably a competitor)

3. Someone in the company has done it before

2. Someone in the team has done it before

1. We all know how to do it

You can map these complexity levels onto the Cynefin domains. Level 1 belongs to the obvious domain, levels 2 and 3 to the complicated domain, and levels 4 and 5 to the complex domain.
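The mapping above can be written down as a small lookup. This is only an illustrative sketch: the names `Domain`, `cynefin_domain` and `needs_probe` are hypothetical and do not come from Keogh's talk, and the scale itself is a guideline rather than a precise measurement.

```python
from enum import Enum


class Domain(Enum):
    """The three Cynefin domains that the article maps the scale onto."""
    OBVIOUS = "obvious"
    COMPLICATED = "complicated"
    COMPLEX = "complex"


def cynefin_domain(level: int) -> Domain:
    """Map a 1-5 complexity estimate to a Cynefin domain (hypothetical helper)."""
    if level not in range(1, 6):
        raise ValueError(f"level must be between 1 and 5, got {level}")
    if level == 1:
        return Domain.OBVIOUS      # we all know how to do it
    if level <= 3:
        return Domain.COMPLICATED  # someone in the team or company has done it
    return Domain.COMPLEX          # nobody here has done it before


def needs_probe(level: int) -> bool:
    """Assumption: only complex items (4s and 5s) call for a safe-to-fail probe."""
    return cynefin_domain(level) is Domain.COMPLEX
```

For example, `cynefin_domain(4)` returns `Domain.COMPLEX`, so a work item estimated at 4 would be a candidate for a spike or prototype rather than extensive up-front analysis.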

Keogh suggested tackling the riskiest, newest work first. This approach helps you to build trust with your stakeholders, as they usually worry about the risks and want to see them addressed. If your stakeholders don't trust you yet, Keogh recommended delivering a nicely complicated 3 to gain their trust, instead of going straight for 4s or 5s.
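One possible reading of this ordering advice, sketched in code. The `WorkItem` type, the `order_backlog` function and the `stakeholders_trust_us` flag are all hypothetical names introduced for illustration; the talk describes a heuristic, not an algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkItem:
    name: str
    level: int  # complexity estimate on the 1-5 scale


def order_backlog(items, stakeholders_trust_us=True):
    """Return items in a suggested delivery order (illustrative assumption).

    With trust: riskiest (highest level) first.
    Without trust: lead with level-3 items, then riskiest first.
    """
    if stakeholders_trust_us:
        return sorted(items, key=lambda item: item.level, reverse=True)
    # False sorts before True, so level-3 items come first.
    return sorted(items, key=lambda item: (item.level != 3, -item.level))
```

Because `sorted` is stable, items with the same level keep their original relative order, so any existing prioritisation within a level is preserved.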

A safe-to-fail probe has to have a way of knowing that it’s succeeding or failing. As you don’t know what will happen, you must be able to dampen or amplify the effect of probing, said Keogh. Safe-to-fail is not about avoiding failure completely, but you need to be able to handle potential failures.
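As a rough sketch of what "knowing success and failure" plus "dampen or amplify" could look like in software, consider a change that is exposed to a fraction of users. Everything here is an assumption made for illustration: the `Probe` class, its thresholds, and the choice to double or halve exposure are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    """A hypothetical safe-to-fail probe: a change shown to a share of users."""
    exposure: float = 0.05           # fraction of users who see the change
    success_threshold: float = 0.10  # signal at or above this counts as success
    failure_threshold: float = 0.02  # signal at or below this counts as failure

    def respond(self, observed_signal: float) -> str:
        """Sense the observed signal and respond by amplifying or dampening."""
        if observed_signal >= self.success_threshold:
            # Amplify: widen the experiment, capped at all users.
            self.exposure = min(1.0, self.exposure * 2)
            return "amplify"
        if observed_signal <= self.failure_threshold:
            # Dampen: shrink the experiment so the failure stays survivable.
            self.exposure = self.exposure / 2
            return "dampen"
        # Neither success nor failure is clear yet.
        return "keep probing"
```

The point of the sketch is that both thresholds are decided before the probe starts, and that the response to failure is already built in rather than improvised afterwards.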

Earlier, InfoQ interviewed Tiago Garcez and asked him what a safe-to-fail experiment should look like:

(...) make sure before you start any initiative where there are considerable risk, that you use controlled experiments where you know what success and failure look like, so that you can evaluate potential solutions or ways forward. Such an approach keeps failure from being expensive or mission critical, while still providing opportunities to learn (if you structured the experiments in a coherent way).

In high uncertainty scenarios, provide coherence, not tests, said Keogh. Under such circumstances, "tests" can’t be certain; they’re examples of what might happen instead. But testers are still really good at coming up with those examples, and that mindset remains essential.

Keogh referred to the work on real options from Olav Maassen and Chris Matts, described in their book Commitment. Experimenting is a way to keep your options open, said Keogh. For instance, a rollback is an option that you can use when things go wrong.

Keogh also mentioned the Palchinsky Principles:

  • Seek out new ideas and try new things
  • When trying something new, do it on a scale where failure is survivable
  • Seek out feedback and learn from your mistakes as you go along

Keogh concluded her talk by giving two suggestions in looking for ways to make it safe to fail:

  • Use ritual dissent from Cognitive Edge, a technique to test and enhance ideas by challenging them
  • "Ask a tester", if you want to know if something is safe-to-fail. Testers are very good at this

Community comments

  • Measuring complexity

    by Stephen Grey,

    This material is useful apart from the complexity scale, which is facile. For instance, something might have been done before by your team but still be complex and so not behave the same way as the previous instance.

  • Re: Measuring complexity

    by Elizabeth Keogh,

    Stephen, I'm using the term "complex" here in the sense in which it's defined in Cynefin. This is not synonymous with "complicated". The complicated domain is one in which the outcome is predictable, but requires expertise to achieve.

    If the team have done it before, then they can use their expertise to achieve the same predictable result.

    If there's something in the context that prevents them from getting the same predictable result, it's because there's something new there. The team haven't done it before (or at least, not in that context).

    Regardless of terminology, the risks and the value are still in the 4s and 5s, and identifying these has still been valuable to a number of teams at many different clients. The scale's also been adopted as one suggested way of profiling risk in Enterprise Services Planning, where it's used at a portfolio level to help align priorities with strategy.

    I suggest that if someone finds something valuable, and you don't understand why, please approach it with curiosity rather than labelling it "facile". I'm always happy to answer questions if I can, and to discover where and how I'm wrong.

    In this case, this is a model, and as George Box said, "All models are wrong, but some are useful." I hope this sheds some more light on how people are using it.

  • Re: Measuring complexity

    by Stephen Grey,

    Sorry if the word facile was confronting but it seems to me that you are making the mistake you attribute to me. You say "If the team have done it before, then they can use their expertise to achieve the same predictable result." but this will only be more or less assured in the ordered domains. In the complex domain repeatability is not assured. What worked last time might not work next time.

    I think I do understand what you are talking about and the idea that complexity, in the sense used in the Cynefin framework, is part of a continuous scale that can be labelled 1-5 is misleading.

    Have you passed these ideas by David Snowden?

  • Re: Measuring complexity

    by Elizabeth Keogh,

    Stephen, please bear in mind that this is a *summary* of a much longer talk, and that not all aspects can be captured in the very short paragraph that Ben's been able to include here. As a Cynefin expert you'll be aware of the shortcomings of codifying knowledge, and will not expect this non-expert summary to accurately convey every preferred perspective.

    For clarity, I'm not labelling complexity. I'm labelling stuff (usually Agile work items at a capability or epic level) to help people work out what's complex and requires probes (usually spikes or prototypes) rather than extensive analysis. The data precedes the framework, as always, and my original talk includes the words "guideline" and "approximation".

    Being able to achieve a predictable result is the *definition* of the ordered domains. If the team can't do that, they're in complexity. I think if we have any disagreement, it's perhaps only in whether the context in that situation counts as "new". Is that the point that you're finding misleading?

    You only asked one question, which I will answer: yes, my ideas have been passed by Dave Snowden.

  • Re: Measuring complexity

    by Stephen Grey,

    I thought most of the article was useful.

    I find that ideas such as that scale are misleading and prevent people new to the field from internalising what complexity really means. They encourage a sense that it's just a bit more of the same on top of complication, a linear variation from an ordered setting, when it is actually a qualitative shift.

    The subject isn't well enough established yet for such misconceptions to be brushed aside by the weight of clear thinking among those who get it.

  • Re: Measuring complexity

    by Elizabeth Keogh,

    Phrases like "the weight of clear thinking among those who get it", implying that I can't possibly be one of them, discourage dialogue.

    This is one of the reasons the subject is not well enough established.

  • Re: Measuring complexity

    by Stephen Grey,

    This is getting out of hand.

    The use of "those of us" was intended to include you.

    I just think we (you included) can do a lot to promote understanding if we are careful about a few basic points and by exercising great care when trying to help others understand. I think the five point scale is unhelpful in this respect.

    I don't think there is much point taking this any further.

  • Re: Measuring complexity

    by Elizabeth Keogh,

    Stephen, I decided to take some time out and think about what you're saying here before responding.

    This scale has worked well to help people adopt what I think you're calling a "qualitative shift"; behaving very differently towards those aspects of their work. Phrases like "learn by doing" and "safe-to-fail" and "experiment" are in common use at my clients (I am trying to move them hard away from "hypothesis"!)

    I have a *lot* of stories I can tell. So I think I am defensive over this because I don't want to give up something which does actually work and which I find useful. I apologise for that defensiveness, and for escalating a mild conflict that could have become a useful discussion.

    I think the reason this works is because I'm not reducing Cynefin to a continuum. I'm taking something which is perceived as continuous - the sharing of knowledge and the gaining of expertise - and showing people why it isn't.

    I think this is why it lands when I do it; because that scale is something that's already familiar to the people on the ground (it seems to land in other knowledge work too, not just IT) and I can then relate the difference between the 5s/4s and everything else to their lived experiences.

    I am very careful to call out the differences in the 4s and 5s, being the complex elements in that scale. I tend to colour them in red, amongst other things. Ben's captured a bit of that emphasis, but perhaps I need to be clearer about the importance of that.

    I'm just a practitioner, not an expert. I'm always amazed when I talk to true experts in Cynefin about *why* things work in the way they do. Hopefully when we meet we can have a better discussion about that. I'm prepared to believe that there might be something in my common contexts (and InfoQ's) which enables this to work. I welcome advice on how to emphasize the important aspects of the scale better. I'd love to find out more about the social and anthropological patterns behind things like this, so that I can spot more patterns in the future.

    If we find ourselves on the same side of the world, I would love to meet in person and have that discussion in a more forgiving environment. Otherwise, please feel free to connect with me over LinkedIn and we can discuss there or over email.
