Jenkins Creator Launches ML Startup in Continuous Risk-Based Testing

Kohsuke Kawaguchi, creator of Jenkins and former CTO of CloudBees, recently blogged about his decision to step down as CTO and co-found Launchable, a startup focused on using machine learning to optimise testing and continuous delivery. Kawaguchi explained that he wants to use machine learning to provide quantifiable indicators that help developers perform the right risk-based testing and give them clear visibility of the quality and impact of the software they are shipping. In a recent article titled "Why Test Automation Is Not Continuous Testing," Wayne Ariola, a manager for Robotic Process Automation at Tricentis, wrote about the pressure that continuous delivery has put on testers to make faster risk-based go-live decisions.

Kawaguchi explained that he’d observed that many disciplines had surrendered "gut and instinct driven decisions" for "quantifiable metrics and models, which enables a whole new angle of process/efficiency improvements." Contrasting this with the practice of software development, he wrote that decisions about what to test and release are still largely driven by "individual experiences, beliefs, and instincts." Kawaguchi described Launchable’s main proposition as:

We think we can predict with meaningful accuracy what tests are more likely to catch a regression given what has changed, and that translates to faster feedback to developers.

Ariola wrote that DevOps practices have resulted in teams having delivery cadences of "anywhere from every two weeks to thousands of times a day." He shared that "software testers are facing increasingly complex applications, they are expected to deliver trustworthy go/no-go decisions at the new speed of modern business." Ariola explained that to meet this need we use "continuous testing" in CI pipelines targeted at providing "feedback on the business risks associated with a software release as rapidly as possible."

Launchable is currently inviting applications to join its public beta. According to the Launchable website, its solution can identify the subset of tests that provides sufficient confidence, based on the specific risks of changes made in the software. The site states that this is made possible by:

(a) machine learning engine that predicts the likelihood of a failure for each test case given a change in the source code. This allows you to run only the meaningful subset of tests, in the order that minimizes the feedback delay.

In his blog, Kawaguchi explained this further and wrote about a hypothetical scenario, where he asked the reader to consider a long running test suite. He proposed that the time to feedback could be greatly reduced if machine learning could be used to "choose the right 10% of the tests that give you 80% confidence."
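The idea of running a small, high-value subset first can be illustrated with a minimal sketch. It is not Launchable's algorithm: the function `select_tests`, the test names, and the scores are all hypothetical, and "confidence" is approximated here as the share of total predicted failure probability covered by the selected tests.

```python
from typing import Dict, List


def select_tests(failure_prob: Dict[str, float], target: float = 0.8) -> List[str]:
    """Return the shortest highest-risk-first list of tests that covers
    at least `target` of the total predicted failure probability."""
    total = sum(failure_prob.values())
    if total == 0:
        return []
    # Most likely to fail first, so the earliest results are the most informative.
    ranked = sorted(failure_prob, key=lambda name: failure_prob[name], reverse=True)
    selected: List[str] = []
    covered = 0.0
    for name in ranked:
        selected.append(name)
        covered += failure_prob[name]
        if covered / total >= target:
            break
    return selected


# Made-up scores that a model might assign to ten tests for a single change.
scores = {"test_checkout": 0.6, "test_login": 0.3}
scores.update({f"test_misc_{i}": 0.0125 for i in range(8)})

subset = select_tests(scores, target=0.8)
print(subset, f"({len(subset)} of {len(scores)} tests)")
```

In this made-up example two of the ten tests carry most of the predicted risk, so they are the only ones selected. A real system would also need to weigh test duration and how well calibrated the predictions are.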

Ariola described successful continuous testing as an activity which is targeted at "business risk," rather than requirements verification alone. He provided examples of how increasing levels of business agility and automation allowed companies to create a range of "competitive differentiators" in their products. He warned that this "also increases the number, variety and complexity of potential failure points."

Ariola explained that judging the value of a test requires understanding how it relates to business risk. He wrote that one way to handle this is to "map risks to application components and requirements," which in turn "are then mapped to tests." Ariola also wrote of the need to strike a balance between accounting for risk and maintaining test cases. He wrote:

Use a test suite that achieves the highest possible risk coverage with the least amount of test cases.
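One way to picture this trade-off is as a weighted coverage problem. The sketch below is an illustration only, not a method from Ariola's article: it assumes a hypothetical mapping from tests to the risks they cover, assigns each risk a made-up weight, and uses a greedy heuristic that repeatedly picks the test covering the most remaining risk. Greedy selection is a common approximation and is not guaranteed to find the smallest possible suite.

```python
from typing import Dict, List, Set


def minimal_risk_suite(risk_weight: Dict[str, float],
                       covers: Dict[str, Set[str]]) -> List[str]:
    """Greedily pick tests until no remaining risk can be covered."""
    remaining = set(risk_weight)
    chosen: List[str] = []

    def gain(test: str) -> float:
        # Weight of the still-uncovered risks this test would cover.
        return sum(risk_weight[r] for r in covers[test] & remaining)

    while remaining:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(covers), key=gain)
        if gain(best) == 0:
            break  # The leftover risks are not covered by any test.
        chosen.append(best)
        remaining -= covers[best]
    return chosen


# Hypothetical risks (with weights) and the tests mapped to them.
risk_weight = {"payment_failure": 5, "data_loss": 4, "slow_search": 1}
covers = {
    "test_checkout_e2e": {"payment_failure", "data_loss"},
    "test_payment_unit": {"payment_failure"},
    "test_backup": {"data_loss"},
    "test_search_perf": {"slow_search"},
}

suite = minimal_risk_suite(risk_weight, covers)
print(suite)
```

Here two of the four tests cover every listed risk, so the other two add maintenance cost without adding risk coverage.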

Ariola critiqued a focus on test automation without seeing it as part of a continuous testing approach, writing that many teams cannot create "realistic tests fast enough or frequently enough." He wrote that such tests were often unable to identify if a "release candidate is too risky to proceed through the delivery pipeline." According to Ariola, automated tests also struggled to keep up with a constantly changing code base. He wrote that:

The constant application change results in overwhelming amounts of false positives and requires a seemingly never-ending amount of test maintenance.

Launchable’s website explains their ML engine is trained with "refined" information from "git repositories and test results from CI systems." It can then make predictions based on code changes. According to Launchable, these predictions are then used to create pull requests as soon as the required level of confidence has been obtained in your CI builds, allowing faster reviews and triggering of deployment pipelines.
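To make the idea of learning from repository and CI history concrete, here is a deliberately naive sketch. It is not Launchable's engine, whose internals the site does not describe: it simply counts how often each test failed when a given file was changed. The record format, the `train` and `predict` functions, and the file and test names are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# One historical CI run: (files changed in the commit, tests that failed).
Run = Tuple[Set[str], Set[str]]
Model = Dict[str, Dict[str, float]]


def train(history: Iterable[Run]) -> Model:
    """Estimate P(test fails | file changed) by simple counting."""
    changed_count: Dict[str, int] = defaultdict(int)
    fail_count: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for changed, failed in history:
        for path in changed:
            changed_count[path] += 1
            for test in failed:
                fail_count[path][test] += 1
    return {
        path: {test: n / changed_count[path] for test, n in tests.items()}
        for path, tests in fail_count.items()
    }


def predict(model: Model, changed: Set[str]) -> Dict[str, float]:
    """Score each test by its highest observed failure rate across the changed files."""
    scores: Dict[str, float] = {}
    for path in changed:
        for test, rate in model.get(path, {}).items():
            scores[test] = max(scores.get(test, 0.0), rate)
    return scores


# Made-up history of four CI runs.
history = [
    ({"cart.py"}, {"test_cart"}),
    ({"cart.py", "auth.py"}, {"test_cart", "test_login"}),
    ({"auth.py"}, set()),
    ({"cart.py"}, set()),
]

model = train(history)
print(predict(model, {"cart.py"}))
```

A counting model like this ignores much of what a trained model could use, such as code structure, authorship, or test flakiness, but it shows how change history and test outcomes can be turned into per-test failure scores.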

In line with this, Ariola described the need for a sufficiently broad range of tests targeting areas of risk. He wrote:

...if the business wants to minimize the risk of faulty software reaching an end-user, you need some way to achieve the necessary level of risk coverage and testing breadth — fast.

Ariola also called for manual exploratory testing to find "usability issues" and "user-experience issues that are beyond the scope of automated testing." He warned that knowing if a "unit test failed or a UI test passed" will not warn you of a reputational risk which may arise from impacting usability:

To protect the end-user experience, run tests that are broad enough to detect when an application change inadvertently impacts functionality which users have come to rely on.

Talking about projects where developers select a subset of tests for faster feedback, Kawaguchi wrote that "people rely on arbitrary subsetting of tests," such as "smoke tests," but that "those tests are chosen completely by instinct." Launchable described its ML solution as being able to provide a "meaningful subset of tests, in the order that minimizes the feedback delay."

Launchable’s site is currently open to applications for participation in its forthcoming beta launch.
