Jim Shore Suggests Automated Acceptance Tests Are Not The Right Move

by Mike Bria on Apr 08, 2010 |

You're starting out in this cool new agile world. You burn through your first books (the classics, of course), read up on the long-standing popular blogs, maybe even glean knowledge and advice here on InfoQ. And the guidance tells you, among other things, that you must automate your tests, particularly your business-facing "acceptance tests", to ensure confidence that requirements are understood and in fact met. Well, guess what: some of the same experts behind that mandate are now proposing the opposite. Don't automate those tests after all.

Leading the recent discussion behind this turn-around advice is James Shore, respected agile thought-leader and, ironically (or is it?), one-time project coordinator of Fit (Ward Cunningham's automated acceptance test framework that started it all).

Prompted by a conversation with Gojko Adzic, Shore posted his ideas about the "problems with [automated] acceptance testing", summarized as the following two points:

  1. The real planned benefit of an automated acceptance tool like Fit was that the business folks ("customers") would write executable examples themselves. History has shown that this very rarely occurs. In a few cases testers write these tests, but in the majority of cases the developers do.
  2. These tests often become a real maintenance burden: they are slow, brittle, and often hard to refactor. On this point, that end-to-end "integration tests" present a higher cost than they are worth, J.B. Rainsberger has a great series of articles explaining his rationale.

In a nutshell then, Shore (and Rainsberger indirectly) asserts that since the intended value (customers writing the tests) is not present, the high cost (maintenance) is not justified.
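To make the first point concrete: the promise was that customers would supply tables of examples readable by both sides, which a tool like Fit would bind to code and execute. The following is a minimal sketch of that idea in plain Python, with a hypothetical discount rule and invented names; Fit itself expresses the examples as HTML tables bound to "fixture" classes rather than inline code.

```python
# Hypothetical business rule (invented for illustration):
# orders of $100 or more get a 10% discount.

def discounted_total(order_total):
    """Apply a 10% discount to orders of $100 or more."""
    if order_total >= 100:
        return round(order_total * 0.90, 2)
    return order_total

# The customer-supplied part: a table of (input, expected output) examples,
# analogous to the rows of a Fit table.
examples = [
    (50.00, 50.00),    # below threshold: no discount
    (100.00, 90.00),   # at threshold: 10% off
    (200.00, 180.00),  # above threshold: 10% off
]

for order_total, expected in examples:
    assert discounted_total(order_total) == expected
print("all examples pass")
```

The debate above is precisely about who writes and maintains those example rows in practice, and whether the binding code is worth its upkeep.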

Wow, don't write automated acceptance tests? Seems like a real 180, radical thinking. Not surprisingly, though, Brian Marick has also been saying similar things for some time now. Again ironic (or is it?), as Marick's 1998 paper on the possible merits of automating "business-facing tests" was pioneering work at the forefront of the automated acceptance testing movement. Ten years later, though, and a full two years ago now, Marick was saying this:

An application built with programmer TDD, whiteboard-style and example-heavy business-facing design, exploratory testing of its visible workings, and some small set of automated whole-system sanity tests will be cheaper to develop and no worse in quality than one that differs in having minimal exploratory testing, done through the GUI, plus a full set of business-facing TDD tests derived from the example-heavy design.

Adzic, the original recipient of Shore's message, concurs on the first point, but isn't fully convinced by the overall message of "don't automate them":

I never really expected customers to write anything themselves, but I was relatively successful in persuading them to participate in specification workshops that led to examples which were then converted to acceptance tests later...Clear examples and improved communication are the biggest benefits of the process, but using a tool brings some additional nice benefits as well. A tool gives us an impartial measure of progress. Ian Cooper said during the interview for my new book that “the tool keeps developers honest”, and I can certainly relate to that. With tests that are evaluated by an impartial tool, “done” is really “what everyone agreed on”, not “almost done with just a few things to fill in tomorrow”. I’m not sure whether an on-site review [as suggested in Shore's writeup] is enough to guard against this completely.

George Dinwiddie also agrees that there's been little success with business folk writing these tests, adding most testers to that category, but insists automation is still worth its cost for the regression defects it prevents:

As Elisabeth Hendrickson says, "If the customer has an expectation, they have expressed that expectation, they have every reason to believe you have already fulfilled that expectation, they don’t want to have to manually go re-verify that you have actually done the thing that you said you did before."

Is that so much to ask?
...
Given that I’m convinced things need to be retested, and that the shorter the iteration the more frequently they need to be retested, I’m not willing to give up on automated tests.
...
If we approach the development of the examples with the customer and in the terms of the customer, then we’ve accomplished the hardest part. It’s worth spending the extra effort to put these examples to work [by automating them] preventing defects rather than finding them after the fact.

Shore soon continued his commentary with an examination of the things he has his teams do to "eliminate defects" without automated acceptance tests. In it, Shore clarifies that he's not suggesting simply throwing out automated acceptance testing; it must be replaced with something, and he goes on to describe his view of that "something". In essence, the approach Shore lays out amounts to a good, rigorous application of modern Extreme Programming practices (nonetheless, the post is well worth a good hard read and a bookmark).

In reaction to both of Jim's posts, Ron Jeffries pitched in with his own long take on the whole discussion. Among many other points, Jeffries still, like Adzic and Dinwiddie, isn't convinced automation should be forgone:

Jim goes on to say that he’s OK if the tests are not automated and if they are not customer-understandable. I’m OK that they are not customer-understandable — though I would prefer that they were if it were close to free. I am less comfortable with the notion that they are not automated. My concern would be that if they are not automated, doors are opened to regressions.

It would be interesting to know when these tests are automated, and when they are not, and what other tests are commonly put in place when they are not. Certainly it is not necessary to run every example to be sure that the code works. Probably it is necessary to run some.
...
My conclusion is that certainly what Jim’s teams are doing is working, and they are doing all the XP practices quite well. If other teams do the practices that well, they’ll probably have similar results.

And I think that automated story tests are the simplest and most certain way to prevent defects cropping up in stories later on.

So, everyone re-emphasizes that getting business folks together with developers and having them talk through examples is still a must-do, whew.  But regarding automating these examples, Shore, Rainsberger, and Marick say maybe not.  Others argue yes.

An interesting debate indeed. What say you?


Automating acceptance tests. by Gabriel Ščerbák

Recently I took part in a similar debate at our company. One of the problems was the cost of automation, particularly for automated acceptance tests, which were created after the system and so were run twice, in the development and pilot environments. I think this can be repaired using Kent Beck's idea of the test-first approach, which allows for top-down, BDD-style design and puts the tests to use throughout the development process.

The first problem mentioned really is not about customers creating tests but, as mentioned, about the customer specifying and understanding the behaviour, which can be documented in the form of executable user stories. Customers can at least review them (and therefore the requirements) on their own.

Regarding the second problem, I think it is also very important to avoid too many examples; automated acceptance tests should IMHO cover different workflows/paths/scenarios, not values - we have unit tests for that. The brittleness is a problem of there being no ties from the specification to the underlying system. First of all, you should design for testability. Second, automated testing tools do not, AFAIK, allow code generation, which could be a huge maintenance enabler: from the standard Login example used in BDD frameworks, the framework could generate a page with login fields and buttons, with navigation to a user profile page with a username label, through a login controller action stub and a user model mock. All of these could then be used when getting to TDD.
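[Editor's note: a minimal sketch of the workflows-versus-values distinction above, with all names and the toy application invented for illustration - the acceptance-level check walks one path through the login workflow, while value variations are left to cheaper unit-level checks.]

```python
# Toy stand-in for the system under test (names invented for illustration).
class FakeApp:
    def __init__(self):
        self.users = {"alice": "secret"}  # known username/password pairs
        self.page = "login"               # current page in the "UI"

    def login(self, username, password):
        # Navigate to the profile page on success, an error page on failure.
        if self.users.get(username) == password:
            self.page = "profile"
        else:
            self.page = "login-error"

# Acceptance-level check: one scenario through the login workflow.
app = FakeApp()
app.login("alice", "secret")
assert app.page == "profile"

# Unit-level checks: value variations (wrong password, unknown user)
# stay out of the acceptance suite.
app2 = FakeApp()
app2.login("alice", "wrong")
assert app2.page == "login-error"

app3 = FakeApp()
app3.login("bob", "secret")
assert app3.page == "login-error"
print("ok")
```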

Would you agree?

To reiterate... by Mike Bria

On behalf of Jim Shore (from an email directly to me):


Thanks, Mike, for your article about this discussion. There was a fair amount of misinterpretation of what I meant by ending my use of automated tests, and I don't think my position is well represented by the responses you quoted in this article. There's a lot of nuance to my position; you can't simply stop using acceptance testing. You have to replace it with something else.

I strongly encourage people to read my "Alternatives to Acceptance Testing" article at jamesshore.com/Blog/Alternatives-to-Acceptance-... to learn more.


So, as Jim says, please do see his second article (which I had also included in my original writeup).

Re: To reiterate... by Mike Bria

Also, I should mention, I've updated the paragraph pointing out this second essay to hopefully better represent Jim's real stance.

Cheers,
MB

A Significant Perspective on Story Tests by WIL PANNELL

Don't forget to consider Patrick Welsh's thoughtful argument:

patrickwilsonwelsh.com/?p=319


InfoQ.com and all content copyright © 2006-2013 C4Media Inc.