Protecting Artificial Intelligence from Itself

Applications using artificial intelligence can be fooled by adversarial examples, creating confusion in the model decisions. Input sanitization can help by filtering out improbable inputs before they are given to the model, argued Katharine Jarmul at Goto Berlin 2018. We need to start thinking of the models and the training data we put into them as potential security breaches, she said.

Katharine Jarmul, data scientist, O’Reilly author and co-founder of KIProtect, spoke about protecting artificial intelligence from itself at Goto Berlin 2018. InfoQ is covering this conference with Q&A, summaries, and articles. InfoQ interviewed Jarmul about fooling AI applications, creating robust and secure neural networks, and mitigating data privacy and ethical data risks.

InfoQ: Artificial intelligence seems to have become a hot thing in software development. Why is this?

Katharine Jarmul: I think artificial intelligence is definitely experiencing a renaissance of sorts after hobbling through the AI winter and benefiting from the growth of "Big Data" practices. How you define artificial intelligence depends on who you ask. I would say the most interesting "AI" to me is the application of machine learning to real-world use cases and business interests, essentially using data + algorithms to help with predictions or automation. This has seen large growth since the early 2000s, especially as datasets grew larger -- and is now available in some automated ways via various MLaaS (Machine Learning as a Service) companies.

InfoQ: How is it possible to fool an application that’s using artificial intelligence?

Jarmul: My talk at 34c3 about Deep Learning Blindspots went into this in great detail, but essentially -- if you have a deep learning-based model, it is susceptible to what are called adversarial examples. These examples "fool" the network into "seeing" something that isn’t there. I use quotes for "fool" and "see" because of course the model doesn’t have eyes or a brain, so it can’t be fooled and it can’t see in our understanding of those words. Instead, we are essentially increasing confusion in the decisions of the model -- meaning we are making it harder for the model to reach the correct decision. This is done by adding specific noise and perturbations to the input which are aimed specifically at disturbing the model. One such example I often reference is the video of a 3D-printed turtle that is labeled as a rifle from all angles (video here: Synthesizing Robust Adversarial Examples: Adversarial Turtle).
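
As a rough illustration of "noise aimed specifically at disturbing the model", the sketch below applies the fast gradient sign method (FGSM), one well-known way to craft adversarial examples, to a tiny logistic-regression classifier. This is an editorial example rather than anything shown in the talk: the model is linear instead of a deep network, and the weights, input values, and function names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Probability that x belongs to class 1 under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def sign(value):
    """Return -1, 0, or 1 depending on the sign of value."""
    return (value > 0) - (value < 0)

def fgsm_perturb(weights, bias, x, label, epsilon):
    """Move each feature by epsilon in the direction that increases the loss.

    For logistic regression with cross-entropy loss, the gradient of the loss
    with respect to the input is (p - label) * weights.
    """
    p = predict(weights, bias, x)
    gradient = [(p - label) * w for w in weights]
    return [xi + epsilon * sign(g) for xi, g in zip(x, gradient)]

# Hypothetical toy model and input, chosen only for illustration.
weights, bias = [2.0, -1.0], 0.0
x = [0.3, 0.2]  # true label is 1
x_adv = fgsm_perturb(weights, bias, x, label=1, epsilon=0.3)

print(predict(weights, bias, x))      # above 0.5: classified as class 1
print(predict(weights, bias, x_adv))  # below 0.5: the decision has flipped
```

No feature moves by more than epsilon, yet the perturbation is chosen to push the model across its decision boundary. Attacks on image models follow the same idea, with the gradient computed through the network.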

InfoQ: What does it take to create a robust and secure neural network?

Jarmul: There is a lot of active research on how neural network-based models can be secured against adversarial examples, and to what degree those approaches will be successful. One of the most interesting and feasible approaches in my opinion is input sanitization. We can think of the model as accepting ANY input, no matter how impractical or impossible that input is. What adversarial examples often do is create almost impossible inputs (a pixel that is bright orange in the middle of a darker patch) and use these to increase uncertainty or change the decision of the model. Input sanitization approaches, such as feature squeezing or other dimensionality reduction applied to the input *before* it goes into the model, are perhaps the most practical and scalable defense when we think of adversarial examples across many different types of models.
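
One common form of feature squeezing is bit-depth reduction: pixel values are quantized to a few levels before they reach the model, which wipes out small, precisely tuned perturbations. The sketch below is a minimal editorial illustration under stated assumptions: pixels are a flat list of floats in [0, 1], `predict` is any hypothetical function returning a list of class scores, and the threshold value is arbitrary.

```python
def squeeze_bit_depth(pixels, bits=4):
    """Quantize pixel values in [0, 1] to 2**bits evenly spaced levels."""
    levels = 2 ** bits - 1
    return [round(min(max(p, 0.0), 1.0) * levels) / levels for p in pixels]

def l1_distance(a, b):
    """Sum of absolute differences between two score vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def looks_adversarial(predict, pixels, bits=4, threshold=0.5):
    """Flag an input whose prediction changes a lot once it is squeezed.

    A benign input should score roughly the same before and after squeezing;
    a large gap suggests the prediction depended on fine-grained noise.
    """
    original_scores = predict(pixels)
    squeezed_scores = predict(squeeze_bit_depth(pixels, bits))
    return l1_distance(original_scores, squeezed_scores) > threshold

# Hypothetical stand-in for a real model: two "class scores" from the mean pixel.
def toy_predict(pixels):
    mean = sum(pixels) / len(pixels)
    return [mean, 1.0 - mean]

print(squeeze_bit_depth([0.0, 0.49, 1.0], bits=1))
print(looks_adversarial(toy_predict, [0.0, 0.49, 1.0], bits=1, threshold=0.3))
```

In practice the squeezed input can either be passed on to the model (sanitization) or compared against the original as above (detection). The right bit depth and threshold depend on the model and data and would need to be tuned.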

That said, what I spoke about at GOTO was a step or two beyond just handling adversarial images or examples, because I think one of our primary concerns in machine learning is not adversarial examples -- but instead privacy and data security concerns. It is relatively easy to extract information from machine learning models and we are deploying more of them into production systems where they are touching the outside internet and are open to potential adversaries. When we train models using personal or sensitive data and then leave their APIs open to others -- I would compare this to opening your database to the internet. We need to start thinking of the models and the training data we put into them as potential security breaches. There has been active research regarding these extraction methods, including the ability to extract biometric data from trained models (Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures) and Professor Reza Shokri’s award-winning paper on Membership Inference Attacks, which shows how you can determine with high accuracy whether a data point was part of the training dataset. Protecting data fed into machine learning models is one of the things my company KIProtect is working on -- namely, how we can make security and privacy easier for data science and machine learning.
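
To make the membership inference idea concrete, here is a deliberately simplified editorial sketch. It is not the method from the paper mentioned above, which trains separate attack models; it only shows the underlying intuition that models tend to be more confident on records they were trained on, so an attacker with API access can guess membership from the returned confidence. The confidence values below are made up for illustration, and the function names and threshold are hypothetical.

```python
def guess_membership(confidence, threshold=0.9):
    """Guess that a record was in the training set if the model is very confident."""
    return confidence >= threshold

def attack_accuracy(member_confidences, non_member_confidences, threshold=0.9):
    """Fraction of records whose membership status the threshold rule guesses correctly."""
    correct = sum(guess_membership(c, threshold) for c in member_confidences)
    correct += sum(not guess_membership(c, threshold) for c in non_member_confidences)
    return correct / (len(member_confidences) + len(non_member_confidences))

# Made-up top-class confidences returned by a hypothetical model API.
seen_in_training = [0.99, 0.97, 0.85]
never_seen = [0.60, 0.92, 0.70]

print(attack_accuracy(seen_in_training, never_seen))
```

An accuracy well above 0.5 on a balanced set would indicate that the model's outputs leak information about who was in the training data, which is why limiting or coarsening the confidence scores an API returns is a commonly discussed countermeasure.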

InfoQ: How can machine learning place data privacy and ethical data use at risk, and what can be done to mitigate such risks?

Jarmul: When we put private or sensitive data into machine learning models, we are asking the model to learn some of that data and use it for later decisions. Elements of that data will essentially be stored in the model -- meaning they can be extracted, much as they could be from an embedded document. This information exposure can then be exploited by an adversary to learn about the training data or model decision process -- exposing either the private data or sensitive logic to anyone who can access the model or model API. So, as an individual, this means that if my personal information or data is being used to create a model, especially one that retains larger pieces of information (like certain neural networks), the model can then be used to extract information about me after it is created.

Because more companies are using machine learning and MLaaS, I think we, as consumers, should be concerned about potential privacy and security risks of having our personal data or data about us and our behaviour in models which are publicly available. As machine learning practitioners, we need to be increasingly concerned with basic security measures around our models and determining how much sensitive information has been exposed to our model. If we factor these into our evaluation criteria, we can hopefully find a nice balance between model success and privacy concerns. At KIProtect, we have evaluated our pseudonymization process using ML models and shown only a very small drop in accuracy (1-2%) for machine learning models trained on protected data; so we see this as not only possible, but essential for more secure and privacy-aware data science.
