Brenda uses artificial intelligence with machine learning to monitor the infrastructure, do quality assurance checks and support troubleshooting, handle alerts and communicate critical issues, and apply auto-healing.
Sree Rama Murthy Pakkala and Collin Mendons from Swisscom will talk at Swiss Testing Day 2020 about Brenda, an AI/ML framework that helps their teams increase quality. The conference will be held online on August 26.
Brenda starts in the morning around 6 AM CET with a health check of applications, followed by QA checks of basic business processes, and then sends reports and alerts. From then on, it keeps monitoring and maintaining the test environment, performing QA checks, and sending reports every hour.
Support for test infrastructure is a complicated topic. Test environments are unstable by nature, and globally distributed teams need support outside regular hours. What if the budget does not allow for that support capacity?
According to Pakkala, Brenda provides a solution by sharing the status of the test environment transparently 24x7. It applies fixes for regular infrastructure issues. The team is able to spend time on innovative tasks rather than monotonous monitoring and restarting activities.
Brenda helped to give stakeholders continuous transparency on the availability and functionality of the infrastructure. According to Mendons, the first level of support was completely taken over by Brenda, which allowed them to make better use of their resources. Brenda also helped to reduce the mean time to identify (MTTI) and the mean time to repair (MTTR).
InfoQ interviewed Sree Rama Murthy Pakkala and Collin Mendons about applying artificial intelligence and machine learning for quality assurance, the technical aspects of and psychological connection to Brenda, what Brenda’s workday looks like, and what they learned from using AI and ML for quality assurance.
InfoQ: What made you decide to apply artificial intelligence and machine learning for quality assurance?
Sree Rama Murthy Pakkala: Initially, AI and ML were not part of the solution. Our support activity for the quality assurance team was more of a reactive approach. With the implementation of a monitoring solution, we moved from a reactive to a proactive approach. When we decided to enhance our support from proactive to predictive, the data collected from the monitoring solutions led us to machine learning and artificial intelligence.
So, it was the challenge and our solution which made us decide on technologies.
Collin Mendons: It was really a journey. In the past, the monitoring solutions we had were scripts that ran and sent email alerts about machine, application and component status. Later on, we moved to continuous monitoring, where we started storing data in a time series database, monitoring it, and creating alerts. This was the moment that made us ask ourselves: what if we used this data to start predicting, rather than just reacting to monitoring alerts? We used this data and applied ML in order to predict failures.
InfoQ: What is Brenda?
Pakkala: Brenda is an integrated monitoring framework with a Machine Learning capability using Artificial Intelligence.
Mendons: The birth of Brenda was an interesting story. There are two aspects to it: 1) the technical aspect and 2) the psychological connection.
Technical:
As part of transformations, we built many monitoring solutions and automation frameworks to increase the quality of our software and infrastructure. All of these solutions were adding a lot of value. But we soon realized that these solutions were working as silos.
For example, even when a monitoring system reported that a service was down, QA automation would still try to run test scenarios that involved that service. Brenda was built to bridge this gap. Brenda is a framework that puts all these pieces (automation, monitoring, troubleshooting, ML models, etc.) together, communicates with them, and makes decisions based on the status of the various modules. This allowed the framework to derive meaningful context from the combined information and create more value.
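The silo problem Mendons describes can be illustrated with a minimal Python sketch. It is not Swisscom's implementation: the `Scenario` type, the `plan_scenarios` function, and the shape of the status map are all hypothetical, chosen only to show how monitoring status could gate which test scenarios are worth running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    # Hypothetical description of one automated QA scenario.
    name: str
    required_services: frozenset


def plan_scenarios(scenarios, service_status):
    """Split scenarios into runnable and skipped, based on monitoring status.

    service_status maps a service name to True (up) or False (down).
    A service missing from the map is treated as down (an assumption
    made here to stay on the safe side).
    """
    runnable, skipped = [], {}
    for scenario in scenarios:
        down = sorted(
            service
            for service in scenario.required_services
            if not service_status.get(service, False)
        )
        if down:
            # Record which services blocked the scenario, for the report.
            skipped[scenario.name] = down
        else:
            runnable.append(scenario.name)
    return runnable, skipped
```

With such a gate, a scenario that depends on a service reported as down is skipped with a stated reason instead of failing as if it had found a functional defect.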
Psychological:
When the framework was created, we interviewed people who were using our various tools like automation, monitoring, troubleshooting scripts, ML models, etc. We asked them how they connected the data from each module and made decisions. We analyzed what the working day of an employee using these tools looked like. The framework was designed in exactly the same manner, to replicate the real behavior of people using these tools. It was really a virtual team member.
So we really wanted to give it a name that people could connect to. Back then we had a chatbot named Bran, who helped testers analyze a defect by allowing them to simply chat with it. From there we decided on the name of a person, and named it Brenda. I strongly believe that naming is very important, as it really makes a direct connection to people. Today people know more about Brenda than the initial creators :-).
InfoQ: What does Brenda’s workday look like?
Mendons: Brenda starts with a health check of the systems through monitoring modules, and based on the status, starts performing QA checks of the important business processes. When there are issues with the QA Checks, she uses the service of troubleshooting scripts and tries auto healing to see if she can solve them and proceed further.
Once the QA is completed, a detailed report is sent to all stakeholders. When our testers start their day, they already have a transparent view of the availability of the basic functionality. From there, she continuously handles the alerts coming from the various modules and tries auto healing as a first step. If the issue persists, it is sent to a real person for deeper analysis and further action.
She repeats the QA checks at regular intervals, takes the necessary action based on the results, and involves the respective teams in case any further action is required.
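The cycle described above (check, try auto healing, re-check, escalate to a person) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Brenda's actual code: `run_cycle`, `CycleReport`, and the idea of passing checks and a `heal` function as plain callables are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CycleReport:
    # Names of checks by outcome; this would feed the stakeholder report.
    passed: List[str] = field(default_factory=list)
    healed: List[str] = field(default_factory=list)
    escalated: List[str] = field(default_factory=list)


def run_cycle(checks: Dict[str, Callable[[], bool]],
              heal: Callable[[str], None]) -> CycleReport:
    """Run each check once; on failure try healing, re-check, then escalate."""
    report = CycleReport()
    for name, check in checks.items():
        if check():
            report.passed.append(name)
            continue
        heal(name)   # first step: an automated fix, for example a restart
        if check():  # verify that the fix actually worked
            report.healed.append(name)
        else:
            report.escalated.append(name)  # hand over to a real person
    return report
```

A scheduler would call `run_cycle` at the chosen interval and send the resulting report. Re-running the check after healing is the design choice that keeps an unsuccessful fix from being reported as a success.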
InfoQ: What have you learned from using AI and ML for quality assurance?
Pakkala: AI and ML provide both operational and financial benefits.
An important consideration is: we should have a clear idea of where and how AI and ML can be used.
As the whole implementation process costs time and money, a risk assessment and a return on investment calculation are required in advance, during planning. The first aspect is operational: by how much are we reducing the Mean Time To Repair (MTTR) and the Mean Time To Identify (MTTI)? Reducing these leads to a better customer experience and higher satisfaction. The second aspect is financial: what is the difference in capacity consumption for test environment monitoring and support with and without Brenda?
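To compare the two metrics before and after such a change, they first have to be computed consistently. The following Python sketch shows one way to do that. It assumes each incident is a dictionary with `started`, `identified`, and `repaired` timestamps, and that both metrics are measured from the start of the incident; these field names and that convention are assumptions made for illustration, since the interview does not say how Swisscom measures them.

```python
from statistics import mean


def mean_minutes(incidents, start_key, end_key):
    """Average duration in minutes between two timestamps of each incident.

    Each incident is a dict of datetime values (hypothetical layout).
    """
    return mean(
        [(inc[end_key] - inc[start_key]).total_seconds() / 60
         for inc in incidents]
    )


def mtti(incidents):
    # Mean Time To Identify: from incident start until it was identified.
    return mean_minutes(incidents, "started", "identified")


def mttr(incidents):
    # Mean Time To Repair: from incident start until it was repaired.
    return mean_minutes(incidents, "started", "repaired")
```

Computing both values over the same period with and without automated first-level support gives the operational side of the comparison Pakkala describes.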
Mendons: ML and AI are today’s buzzwords. Not all modules of Brenda are ML and AI. The framework is a combination of traditional programming and ML/AI models.
The most important part of an ML/AI implementation is understanding the need, understanding the data, and picking the right use case for ML/AI.
If the same results can be achieved with a simple program, go for it. Picking the wrong use case for ML can become very expensive.