
Reducing Verification Lead Time by 50% by Lowering Defect Slippage and Applying AI/ML Techniques

Key Takeaways

  • A clear strategy and roadmap are necessary to drive continuous improvements that increase the efficiency and effectiveness of software application integration and verification.
  • A quick win is to reduce defect slippage. Every defect prevented needs no effort to fix and saves regression test and (re)verification effort.
  • AI and ML techniques offer the potential to increase verification activities’ efficiency and effectiveness. Examples include conducting data analytics on the installed base data, automatic impact analysis, and defect classification.
  • Understanding how customers use products and generating typical customer workflows can effectively increase test coverage.
  • Algorithms to generate synthetic data offer the potential to use more realistic image data with test automation in comparison with the existing phantom data and will enhance the test coverage.

Many will recognize that verification timelines in development programs are often stretched. To keep release dates as planned, last-minute changes and defect fixes require flexibility from the verification team to deliver on our commitments. That’s where we challenged ourselves with the following questions: Can we increase our flexibility? Can we increase our test coverage? Can we increase our efficiency? And is it possible to reduce our verification lead time by 50%?

In the System Integration & Verification department of Philips Magnetic Resonance (MR) business, we are deploying our test strategy that should bring us there in two to three years. In this article, I’ll explain two important "pillars" of our strategy: shifting left and using state-of-the-art techniques to support our verification activities.

Challenges faced in integration and verification

The challenges that we face are linked to the complexity of our products, the growing number of configurations we need to cover during testing, parallel program execution, and the fact that we’re organized in different locations and are working with multiple internal and external suppliers. These challenges directly impact the verification plans, effort allocation, and lead time.

A challenge that’s impacting the lead time of verification execution is defect slippage, especially if we find defects in the late phases of our projects. This is an area where we can still improve. In the final stages of verification, we shouldn’t be "testing"…verification should be a "documentation effort" to prove our product meets its requirement specifications.

Verification is always at the end of development, where "pressure" to deliver gets high. To manage this "pressure," we must become smarter and more efficient in how we verify. Our vision is to reduce the verification lead time by 50% while at the same time moving away from a risk/impact-based test strategy and simply testing everything with every release!

The strategy to reach this vision focuses on the following pillars:

  • Shift Left - from defect detection toward defect prevention, shifting testing focus from verification toward requirements, development, and continuous integration.
  • World Class Team - build centers of excellence for the different verification domains and manage competencies/expertise across our global footprint, having integration and verification teams in Best (Netherlands), Bangalore and Pune (India), and Suzhou (China).
  • Test Coverage - restructuring requirements: splitting design from requirements and shifting requirements to the appropriate component/subsystem/product level. Improving test designs at the product and subsystem levels. Driving test coverage improvements via Operational Profiles/Workflows, i.e., testing closer to real customer use.
  • Test Automation - using state-of-the-art technologies to drive improvements in efficiency and test automation coverage.
  • Global Test Bay Footprint - optimize test configurations across sites to create/allow more flexibility in test execution.

Maybe the most important contributor to the efficiency and effectiveness of testing and verification activities is reducing defect slippage. It’s also an enabler for the overall productivity of R&D organizations, so I’ll elaborate more on this.

Are you using and/or thinking of using Artificial Intelligence (AI) and Machine Learning (ML) to drive the efficiency and effectiveness of your verification processes? Maybe what we’re working on inspires you to explore similar areas.

Shift Left - from defect detection toward defect prevention

Our strategy here is built around measuring defect slippage. In principle, there are three questions to ask about every defect:

  • In which development phase has the defect been found?
  • In which phase should the defect have been found?
  • What caused the defect to slip?

Once defects have been assessed for defect slippage, the next step is to look at the data. Which development phase shows the highest slippage? The slippage "heatmap" will clearly show this:
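As a minimal sketch of how the three questions turn into a heatmap, the records below tabulate hypothetical defects (phase names and counts are illustrative, not our actual data) by the phase where each was found versus the phase where it should have been found:

```python
from collections import Counter

# Hypothetical defect records: where each defect was found vs. where it
# should have been found (illustrative data only).
defects = [
    {"found": "Verification", "should": "Requirements"},
    {"found": "Verification", "should": "Development"},
    {"found": "Verification", "should": "Development"},
    {"found": "Integration",  "should": "Development"},
    {"found": "Development",  "should": "Development"},
]

def slippage_heatmap(defects):
    """Count defects per (phase found, phase it should have been found in).
    A defect has 'slipped' when the two phases differ."""
    return Counter((d["found"], d["should"]) for d in defects)

heatmap = slippage_heatmap(defects)
slipped = sum(n for (found, should), n in heatmap.items() if found != should)
print(heatmap[("Verification", "Development")])  # 2
print(slipped)                                   # 4
```

Each cell of the resulting matrix is a (found-phase, expected-phase) count; off-diagonal cells are the slippage you want to drive to zero.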


By creating Pareto graphs, you can zoom into other details, such as which subsystems or functional areas show the highest slippage.
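A Pareto over slipped defects can be sketched in a few lines; the subsystem names below are made up for illustration:

```python
from collections import Counter

# Hypothetical slipped defects tagged with the subsystem they belong to.
slipped_defects = ["Gradient", "RF", "Gradient", "Console", "Gradient", "RF"]

def pareto(items):
    """Return (label, count, cumulative %) rows sorted by descending count,
    the data behind a Pareto chart."""
    counts = Counter(items).most_common()
    total = sum(n for _, n in counts)
    cum, rows = 0, []
    for label, n in counts:
        cum += n
        rows.append((label, n, round(100 * cum / total)))
    return rows

print(pareto(slipped_defects))
# [('Gradient', 3, 50), ('RF', 2, 83), ('Console', 1, 100)]
```

The cumulative-percentage column is what lets you pick the "top 2-3" areas that account for most of the slippage.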


It is unrealistic to expect we can address everything in one go. We select the top 2-3 areas with the highest slippage and, with a multi-disciplinary team, execute a deep dive to understand the true root causes. Examples of root causes are missing requirements, gaps or inconsistencies in requirement decomposition, missing test cases, gaps in impact assessment, etc.

Knowing the root causes allows us to identify and implement the required actions and improvements to prevent slippage from happening again for the targeted functional areas. With recent programs, we’ve been focusing more on software integration testing, and as such, we have seen the first-time pass rate of verification increase significantly. This resulted in more predictability in our verification timelines. The next step is to shift focus toward development and development testing.

Apply machine learning and artificial intelligence in testing

As mentioned earlier, verification is always at the end of programs, where pressure gets high. Our strategy will help to face the challenges that come with the "pressure," but I feel there are more opportunities; e.g., can Machine Learning (ML)/Artificial Intelligence (AI) help us to become smarter in how we test?

We’re currently exploring some areas where I believe ML and AI can help.

Data Analytics on Installed Base data

We have access to the log files from our systems in the Installed Base to improve our products’ quality and reliability. Understanding how our customers use our systems allows us to improve our test coverage by designing our test cases to get as close as possible to the real use of our products - information that can be extracted from our log files.

An important aspect of the customer workflow is scanning to produce images of the different anatomies. Our customers create their own scan protocols for this, according to their preferences, to optimize, e.g., contrast, resolution, and speed of scanning. The result is an enormous variety of scan protocols for the different anatomies used in the field. It’s impossible to use all of them as input for verification activities and still finish within an acceptable verification timeline.

We asked ourselves: Is there a set of scan protocols used across the installed base that represent the typical use? With process mining and filtering techniques, we can generate the typical scan protocols for more than 160 anatomic regions in only a few minutes - something impossible to achieve manually. The result is a manageable set of scan protocols covering the different anatomies, with the best achievable coverage and as close as possible to real customer use. As a next step, we’re now looking into opportunities to extract the typical actions executed in combination with these scan protocols, like image post-processing steps, archiving, and/or printing data, etc.
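The real pipeline uses process mining; as a crude stand-in, the sketch below shows the frequency-filtering idea on made-up log data (anatomy names, protocol parameters, and the threshold are all hypothetical):

```python
from collections import Counter, defaultdict

# Hypothetical log extract: (anatomy, scan-protocol parameters) pairs as
# they might be mined from installed-base log files.
scans = [
    ("brain", ("T1w", "1mm")), ("brain", ("T1w", "1mm")),
    ("brain", ("T2w", "2mm")),
    ("knee",  ("PDw", "3mm")), ("knee",  ("PDw", "3mm")),
]

def typical_protocols(scans, min_share=0.5):
    """For each anatomy, keep only protocols that account for at least
    `min_share` of that anatomy's scans -- a simple frequency filter
    approximating 'typical use'."""
    by_anatomy = defaultdict(Counter)
    for anatomy, protocol in scans:
        by_anatomy[anatomy][protocol] += 1
    result = {}
    for anatomy, counts in by_anatomy.items():
        total = sum(counts.values())
        result[anatomy] = [p for p, n in counts.most_common()
                           if n / total >= min_share]
    return result

print(typical_protocols(scans))
# {'brain': [('T1w', '1mm')], 'knee': [('PDw', '3mm')]}
```

Scaled up to the full installed base, the same idea yields one manageable, representative protocol set per anatomic region.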

Automated impact analysis of changes

We’re experimenting with code coverage measurements at the product level. If we can measure code coverage at the product level for the tests that we have automated, we can train an algorithm on traceability between tests and code. Then, if we change the code, the question is whether we can automatically identify the tests to run as part of the CI pipeline. In fact, we’re using AI to automate the impact analysis of changes.
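The core selection step can be sketched as a coverage-map lookup; test and file names below are invented for illustration, and a trained model would replace the plain set intersection:

```python
# Hypothetical coverage map: which source files each automated test touched
# during a product-level coverage run (all names are illustrative).
coverage = {
    "test_scan_start":  {"scanner.c", "ui.c"},
    "test_image_recon": {"recon.c"},
    "test_export":      {"export.c", "ui.c"},
}

def impacted_tests(changed_files, coverage):
    """Select the tests whose recorded coverage intersects the change set --
    the basic idea behind automated impact analysis in a CI pipeline."""
    return sorted(t for t, files in coverage.items()
                  if files & set(changed_files))

print(impacted_tests(["ui.c"], coverage))
# ['test_export', 'test_scan_start']
```

In a CI pipeline, the change set would come from the commit diff, and the selected subset runs instead of the full suite.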

What if we can include history data on test execution results (passed/failed tests) and defects? Can we then include the appropriate level of regression tests automatically? We’re currently exploring this idea - the proof of concept still needs to be planned.

Synthetic Data

Testing MR functionality needs image data. We’re mostly using phantoms for this. Phantoms are specially constructed objects designed for imaging with, e.g., MR scanners. They contain patterns and fluids that produce consistent and predictable results when scanned. By using phantoms, we miss the variation in anatomies that we would encounter when testing with real image data from scanning patients. For in-house verification and validation activities, we’re allowed to scan volunteers, but their availability, and therefore the image data, is still limited. So, we’re exploring synthetic data as input for our tests.


"Unlimited" data and the variation in that data will increase test coverage. Synthetic data as input for automated testing allows us to increase the utilization of our test systems. Also, the General Data Protection Regulation (GDPR) does not apply to synthetic data.
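A toy example of the idea: generate phantom-like images whose geometry and intensity vary per sample. Everything here (sizes, distributions, noise level) is invented for illustration; real synthetic MR data uses far richer generative models:

```python
import numpy as np

rng = np.random.default_rng(0)

def synthetic_phantom(size=64, rng=rng):
    """Generate a toy 2-D 'phantom' image: an ellipse whose radii and
    intensity vary per sample, mimicking anatomical variation.
    (Illustrative only, not a real MR image model.)"""
    y, x = np.mgrid[:size, :size]
    cy = cx = size / 2
    ry = rng.uniform(0.2, 0.45) * size
    rx = rng.uniform(0.2, 0.45) * size
    mask = ((y - cy) / ry) ** 2 + ((x - cx) / rx) ** 2 <= 1.0
    image = np.zeros((size, size))
    image[mask] = rng.uniform(0.5, 1.0)
    image += rng.normal(0, 0.01, image.shape)  # scanner-like noise
    return image

# "Unlimited" variations: every call yields a different image.
batch = [synthetic_phantom() for _ in range(10)]
print(batch[0].shape)  # (64, 64)
```

Because generation is cheap and parameterized, a test run can sweep variations that no physical phantom library could cover.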

Defect classification

Another area where we’re starting to explore the opportunity of using AI is defect management. What if, based on the headline, the defect description, and attributes, we could automatically assign defects to the appropriate team for investigation and/or resolution? We could also detect whether a defect is a duplicate of an earlier submitted defect, or propose to reject it based on previous rejections.

For now, defects are all managed by the Defect Review Board (DRB). The DRB consists of key experts from different disciplines/functions who spend significant time reading and understanding defect content and deciding on the next steps. This time can be significantly reduced in cases where defects can be assigned without DRB discussion, just by reviewing and agreeing on the proposal made by the AI algorithm. The first results using training and test data from existing defect databases look promising. One challenge is that some defect attribute data changes over time, e.g., due to changes in organization/group names. A data validation and enrichment step is therefore needed before training the algorithm.
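To make the assignment idea concrete, here is a nearest-neighbour sketch on bag-of-words similarity. The headlines and team names are hypothetical, and a trained classifier would replace this hand-rolled similarity:

```python
import math
from collections import Counter

# Hypothetical historical defects: (headline, team that resolved it).
history = [
    ("image reconstruction crashes on large volume", "Recon"),
    ("export to PACS fails with timeout", "Connectivity"),
    ("UI freezes during scan planning", "Console"),
]

def tokens(text):
    """Bag-of-words representation of a defect headline."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def propose_team(headline, history):
    """Propose the team of the most similar historical defect -- a crude
    stand-in for the trained classifier described in the text."""
    best = max(history, key=lambda h: cosine(tokens(headline), tokens(h[0])))
    return best[1]

print(propose_team("crash in reconstruction of large volume", history))
# Recon
```

The same similarity score, compared against a threshold, also flags likely duplicates of earlier submissions.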

Benefits of improving efficiency

The most significant benefit we’re aiming for is increasing the predictability of the verification execution and reducing our verification lead time. As mentioned earlier, our goal is to reduce the verification lead time by 50%. This allows us to use our expertise more and more during design and integration, rather than only running verification tests. We started our journey but haven’t reached the finish line yet.

An important enabler for predictable verification execution is the reduction of defect slippage. Every development phase in which a defect is missed increases the cost of repair. Finding defects in verification results in re-verification and additional regression tests based on impact assessment. If you assume every defect takes, on average, 3-5 days of effort for investigation, resolution, and verification, it’s easy to calculate what your effort savings will be. This is effort that everyone prefers to spend on value creation.
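The back-of-the-envelope calculation looks like this (the count of prevented defects is a made-up input; the 3-5 day range comes from the assumption above):

```python
def effort_saved(defects_prevented, days_per_defect=(3, 5)):
    """Person-day savings range from preventing defect slippage, using an
    assumed 3-5 person-days per slipped defect."""
    lo, hi = days_per_defect
    return defects_prevented * lo, defects_prevented * hi

# E.g., preventing 40 slipped defects in a program:
print(effort_saved(40))  # (120, 200) person-days saved
```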

What we have learned so far

Shifting left, based on measurements and analysis of defect slippage, ensures the right focus, and continued defect slippage measurements will show the improvements achieved. AI and ML techniques have already proven useful in bringing test coverage closer to the actual customer usage of our products. The first feasibility results show a promising outlook for improving efficiency in other areas like impact analysis and (synthetic) data creation. Yet we have just started and still need to learn along our journey.

We want to improve and work on many different topics, but we also have to accept that we cannot do everything at once. We need to pick our battles. Quick wins help to achieve results fast, to drive change, and to earn the budget to continue investing in subsequent improvements. For doing so, having a vision and strategy and creating a roadmap and priorities is the way to drive continuous improvements. This will create the "breathing space" to ensure we continue delivering on our operational commitments.

What the future will bring for ML and AI in testing

With AI algorithms becoming increasingly integrated into our products, the quality and availability of adequate data, both for training and for verification of the algorithms, becomes critical. In our situation, and I expect the same for other industries, I firmly believe synthetic data will become an important enabler. As we work with patient data, the General Data Protection Regulation makes it more and more difficult to work with real image data. From this perspective, synthetic data will help.

I also believe the industry will adopt AI/ML technologies to change how we do verification. For example, unlocking the history of integration and verification data enables opportunities to automate the impact assessment of changes and define what needs to be tested. AI can also support automated planning of the appropriate level of (regression) testing. Analyzing and processing Installed Base data to understand how customers use our products is critical to improving test coverage. Designing for detectability of how our customers use our products, as well as being able to connect to the installed base to retrieve this information, is vital.
