Testing Systems with a Nest of Tests

James Lyndsay gave a workshop titled "a nest of tests" at Agile Testing Days 2015. In this workshop he explored how to design large collections of tiny tests and visualize their output to test systems, and showed how tools can help with this. InfoQ interviewed him about this testing approach.

InfoQ: Can you explain what you mean by a "nest of tests"?

Lyndsay: Bird nests are built from hundreds of insubstantial scraps – individually inadequate, but together able to protect a growing family. As testers, we can take measurements that are dull or trivial if considered individually, but which can support us as we try to discover deep truths about the systems we’re helping to build. And it rhymes, which is pleasant.

To see what I mean, go have a look at Mike Bostock’s "Will it Shuffle". On that page, Bostock shows the flaws in a shuffle algorithm by building a picture of 1800 measurements, each of which summarises 10,000 separate runs through that algorithm. In his picture, Bostock has arranged for colour to show bias in the shuffle, so the more colourful and organised the picture, the more flawed the algorithm. And because the algorithm uses a built-in function which varies by browser, the same code shows different emergent properties, and hence different flaws, on different browsers. One shuffle in isolation wouldn’t tell us anything about this, but 10,000 do. Choosing and building the right picture makes it easy to notice these differences.
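The idea of summarising many runs into a grid of small measurements can be sketched in a few lines of Python. This is not Bostock's code: the flawed algorithm here is a well-known naive swap shuffle, chosen as a stand-in, and the function names (`naive_shuffle`, `fisher_yates`, `bias_matrix`, `worst_bias`) are illustrative assumptions. Each cell of the resulting matrix is one trivial measurement: how much more or less often than expected an element landed in a given position.

```python
import random


def naive_shuffle(items, rng):
    """Deliberately flawed: each position is swapped with ANY position."""
    a = list(items)
    n = len(a)
    for i in range(n):
        j = rng.randrange(n)
        a[i], a[j] = a[j], a[i]
    return a


def fisher_yates(items, rng):
    """Unbiased reference: each position is swapped only with one not yet fixed."""
    a = list(items)
    for i in range(len(a) - 1, 0, -1):
        j = rng.randrange(i + 1)
        a[i], a[j] = a[j], a[i]
    return a


def bias_matrix(shuffle, n, runs, rng):
    """Return an n x n grid; cell [element][position] is the relative
    deviation from the count a fair shuffle would give on average."""
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        result = shuffle(range(n), rng)
        for position, element in enumerate(result):
            counts[element][position] += 1
    expected = runs / n
    return [[(c - expected) / expected for c in row] for row in counts]


def worst_bias(matrix):
    """Largest absolute deviation anywhere in the grid."""
    return max(abs(cell) for row in matrix for cell in row)


if __name__ == "__main__":
    rng = random.Random(2015)  # fixed seed so a run can be repeated
    for shuffle in (naive_shuffle, fisher_yates):
        grid = bias_matrix(shuffle, n=3, runs=60_000, rng=rng)
        print(shuffle.__name__, round(worst_bias(grid), 3))
```

No single cell says much on its own; colouring the whole grid by its values (for example in a spreadsheet heat map) is what makes a pattern of bias stand out.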

InfoQ: What do you consider to be the power of visualisation in testing?

Lyndsay: Visualisation puts the results of many measurements in a form which suits the marvellous visual processing of our brains, allowing the information in the data to influence the models we make with our minds, and the understanding we develop in our teams. A picture is worth a thousand data points.

InfoQ: In your workshop you asked the audience which tools they use to visualise data. Which tools do you hear are used most often? Which ones do you recommend?

Lyndsay: Excel. It’s the universal screwdriver for testers. It draws great graphs, and can process, filter and sort the data. However, it’s inconvenient to switch from column to column by name, which can make it painful to explore data compared to Raw. I think we may find big-data tools like Splunk and Kibana work well for analysing test results, and I’ll enjoy playing with both later this year. But graph-drawing tools are a link in a chain, so one needs tool help with generating and applying data, too.

InfoQ: You mentioned in your workshop that you often pick a scatter plot to visualize data. Can you explain how you do this?

Lyndsay: I use a tool, DataGraph from Visual Data Tools. It allows me to take a table of measurements from many tests and plot any column against any other – then lets me filter, colour and resize elements to reveal relationships between measurements. DataGraph costs real money and runs only on OS X, but there’s a fine open-source tool – Raw by DensityDesign – which runs in the browser, and has a great scatter plot.

InfoQ: For which situations do you recommend testing with many automatically generated tests?

Lyndsay: I’ve found interesting problems swiftly when driving components during exploratory testing. For instance, the code or the unit tests might lead me to an algorithm which takes a range in one or more of its inputs. If that algorithm is checked at only a few isolated points, I’ll generate detailed or broad ranges across combinations of variables, and use graphs to reveal points of interest.
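A minimal sketch of this kind of sweep, under stated assumptions: `naive_hypot` is a hypothetical component under test (not something from the workshop), `math.hypot` from the Python standard library serves as the reference, and the choice of exponent ranges is arbitrary. The function is correct at the everyday values a unit test would likely check, and the sweep generates a broad grid of input combinations and records one small measurement per combination.

```python
import csv
import io
import itertools
import math


def naive_hypot(a, b):
    """Hypothetical component under test: fine at everyday magnitudes."""
    return math.sqrt(a * a + b * b)


def sweep(exponents):
    """Measure every combination of a = 10**ea and b = 10**eb."""
    rows = []
    for ea, eb in itertools.product(exponents, repeat=2):
        a, b = 10.0 ** ea, 10.0 ** eb
        expected = math.hypot(a, b)  # reference result
        actual = naive_hypot(a, b)
        rows.append({
            "exp_a": ea,
            "exp_b": eb,
            "expected": expected,
            "actual": actual,
            # inf when the naive version overflows, 1.0 when it underflows to 0
            "rel_error": abs(actual - expected) / expected,
        })
    return rows


def to_csv(rows):
    """Serialise the table so it can be loaded into a plotting tool."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    table = sweep(range(-300, 301, 50))
    surprises = [r for r in table if r["rel_error"] > 1e-9]
    print(len(table), "measurements,", len(surprises), "worth a closer look")
    print(to_csv(table)[:200])
```

Plotting `exp_a` against `exp_b` as a scatter plot, coloured by `rel_error`, would show the regions where the component misbehaves as blocks at the extremes of the grid, which isolated spot checks near the middle would miss.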

I’ve also had great results when looking for surprises in integration testing, particularly when component parts are different technologies, or when a simple solution is built from powerful parts.

There’s nothing particularly new here, and there are clear parallels with performance testing and fuzzing.

InfoQ: When would you not recommend it?

Lyndsay: It’s limited to those parts of a system which can be driven hard, easily. You need to be handy with your tools, and any sense of completeness should be treated with suspicion – though it seems broad, it’s typically deep. It’s not an efficient way to verify behaviour, however many times a measurement matches expectations.

You’ll sometimes frustrate your colleagues who wrote the code. But sometimes we need to be mean to the systems we build – which are much more than simply code – to find the truth.

InfoQ: What are the benefits of this test approach?

Lyndsay: It’s a quick and cheap way to reveal a range of system behaviours – and if those behaviours surprise us, we have the chance of getting closer to understanding the real nature, and risks, of what we’ve made.
