Experiences from Testing Stochastic Data Science Models

A data science model is a statistical black box; testing it requires an understanding of the algorithms, randomness, and statistics behind it. To validate the output of a stochastic model, you can define thresholds that bound the acceptable variance in results.

Laveena Ramchandani, a senior consultant, shared her experience from testing data science models at Agile Testing Days 2020.

Data science is the analysis and study of data. A data scientist is responsible for making decisions that benefit companies, Ramchandani explained. Data science allows visualizing and analyzing patterns in the data and uses visualization techniques to draw graphs that underline the analytical procedures.

As a tester, the skills you have acquired in past projects will come in handy for testing data science models too. One thing that is particularly useful is curiosity about how the model functions, especially when it is stochastic (random), Ramchandani mentioned. Mathematics is also sometimes involved when validating results, so some knowledge of statistics is useful, she said.

Ramchandani mentioned that validating your results at the end of a run is what makes testing a data science model completely different from other types of testing:

I noticed that the data model I work with provides insightful information to its clients, but in some areas, it is stochastic (random). This is what makes it special too; the results can deviate a little bit, but having a threshold in place like "we can accept changes between 3-5%" is something new that I learned. It was also crucial to learn from a data scientist how to know whether my results are accurate or not.

Working alongside data scientists on how to test a new feature was quite interesting, as Ramchandani mentioned:

It is helpful to pair up with data scientists and developers as they help you understand what the new implementation is and how we can validate it.

InfoQ interviewed Laveena Ramchandani about testing data science models.

InfoQ: What is a data science model? How do they look?

Laveena Ramchandani: Like any other code really, lines of code. It has its speciality in terms of the algorithms being used to allow the model to perform: algorithms such as genetic algorithms and predictive analytics. The main speciality in the model I work with is the randomness generated for some results in certain stages.

InfoQ: What makes testing a model different from testing, for instance, a software product?

Ramchandani: It’s important to understand your data set, for example, what is it going to be used for? How am I testing it? Do I understand it fully? Can I use a golden data set or do I have to use a client data set which is anonymised? These are some questions that help understand the data a little bit more.

Put yourself in the position of the user who will be using this one day and look at the model from that aspect too (user testing).

Each client will provide different configurations to help them understand how their business could benefit, therefore having a good understanding of the configurations was quite useful too.

A model configuration is something that I found quite new; there were plenty of configurations to set before running a model. It was vital for me to make sure I understood these before proceeding, as they will help me understand my results.

InfoQ: What does your testing process look like?

Ramchandani: My testing process is very similar to other testing processes. I am part of an agile team and I test any model-related features within a sprint. The tests can be either manual checks or checks of API endpoints to see whether the front end application is working as intended. The majority of the time I have performed manual checks. The next step would be to automate the process and see how the model performs then.

Something else I learned is that it is a good idea to have a baseline run once a model is created; then, when you do new releases, you can compare your new run to the baseline run and gain confidence going forward.
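A baseline comparison of this kind could be sketched as follows. This is an illustration only, not the tooling used in the project described: the function name `compare_to_baseline`, the metric names, and the 5% default are all assumptions, with the default taken from the upper end of the "3-5%" range mentioned earlier.

```python
def compare_to_baseline(baseline, new_run, max_deviation=0.05):
    """Return the metrics whose relative deviation from the baseline run
    exceeds max_deviation, mapped to the deviation that was observed.

    baseline and new_run are dicts of metric name -> numeric result.
    All names and the 5% default are hypothetical, for illustration only.
    """
    failures = {}
    for name, base_value in baseline.items():
        if name not in new_run:
            # A metric that disappeared from the new run is always flagged.
            failures[name] = float("inf")
            continue
        new_value = new_run[name]
        if base_value == 0:
            # Relative deviation is undefined at zero; require an exact match.
            deviation = 0.0 if new_value == 0 else float("inf")
        else:
            deviation = abs(new_value - base_value) / abs(base_value)
        if deviation > max_deviation:
            failures[name] = deviation
    return failures


# Hypothetical results from a stored baseline run and a new release.
baseline_results = {"total_cost": 1000.0, "units": 200.0}
release_results = {"total_cost": 1030.0, "units": 230.0}

out_of_tolerance = compare_to_baseline(baseline_results, release_results)
print(out_of_tolerance)  # only "units" deviates by more than 5%
```

An empty result means every metric stayed within tolerance of the baseline; anything returned is a prompt to ask a data scientist whether the change is expected.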

InfoQ: How do you ensure the quality of testing when testing a model?

Ramchandani: We can ensure the quality of testing by:

Making sure we have enough information about a new client requirement and the team understands it

Validating results including results which are stochastic

Making sure the results make sense

Making sure the product does not break

Making sure no repetitive bugs are found, in other words, a bug has been fixed properly

Making sure to pair up with developers and data scientists to understand a feature better

If you have a front end dashboard showcasing your results, making sure the details all make sense and doing some accessibility testing on it too

Testing the performance of the runs as well, to see whether certain configurations make them take longer

InfoQ: How can testers validate the results from stochastic models?

Ramchandani: As mentioned above, I learned that having thresholds was a good option for a model that delivers results to optimise a client’s requests. If a model is stochastic, then certain parts will have results which may look wrong, but they are actually not. For instance, 5 + 3 = 8 for all of us, but the model may output 8.0003, which is not wrong; with a stochastic model, what was useful was adding thresholds of what we could and couldn’t accept. I would definitely recommend trying to add thresholds; if your model is providing important insights around fraud detection, for instance, then your threshold should be even lower, around 0.5-1% acceptance of deviation in results.

Make sure you also understand your data set, as well as all the configurations that are being inserted to make the model run and provide optimised results.
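As a rough illustration of the threshold idea, a test can compare a stochastic result with its expected value and accept it when the relative deviation stays under an agreed limit. The sketch below is not taken from the model discussed in the interview: the function names and sample values are assumptions, and the limits are set to the upper ends of the ranges quoted in this article (5% in general, 1% for something like fraud detection).

```python
def relative_deviation(actual, expected):
    """Return |actual - expected| as a fraction of |expected|."""
    if expected == 0:
        raise ValueError("expected must be non-zero for a relative comparison")
    return abs(actual - expected) / abs(expected)


def within_threshold(actual, expected, max_deviation):
    """True when actual deviates from expected by at most max_deviation,
    where max_deviation is a fraction (0.05 means 5%)."""
    return relative_deviation(actual, expected) <= max_deviation


# Hypothetical limits based on the ranges quoted in the article.
GENERAL_LIMIT = 0.05  # "we can accept changes between 3-5%"
STRICT_LIMIT = 0.01   # "around 0.5-1%" for fraud-detection-style models

# A tiny drift from the expected 8 is accepted under either limit.
assert within_threshold(8.0003, 8.0, GENERAL_LIMIT)
assert within_threshold(8.0003, 8.0, STRICT_LIMIT)

# A 1.25% drift passes the general limit but fails the strict one.
assert within_threshold(8.1, 8.0, GENERAL_LIMIT)
assert not within_threshold(8.1, 8.0, STRICT_LIMIT)
```

The limit itself is a business decision rather than a testing one, which is why agreeing on it with data scientists and the client matters more than the comparison code.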

InfoQ: What have you learned from testing models?

Ramchandani: I have learned a lot about how results can be stochastic and how we can accept results within a certain threshold.

Also, configurations to run a model were something new I learned about, so it was vital to understand what each of them meant to the model and the client. Each client would require different information for each configuration. It was useful to do some exploratory testing on the configurations and see what the model was providing me with.
