
MIT Researchers Use Explainable AI Model to Discover New Antibiotics

Researchers from MIT's Collins Lab used an explainable deep-learning model to discover chemical compounds that could fight methicillin-resistant Staphylococcus aureus (MRSA). The model uses graph neural networks to identify compounds that are likely to have antibiotic properties. Additional models predict whether the compounds would be harmful to humans.

Using a training dataset of 39,312 molecules and their experimentally observed antibiotic effects, the team trained an ensemble of 20 models with the Chemprop framework; given a new molecule, the trained ensemble predicts how likely it is to have an antibiotic effect. They also created another set of models that predict a molecule's cytotoxicity in three types of human cells: liver carcinoma, skeletal muscle, and lung fibroblast cells. By applying these models to more than 12 million compounds in a molecule database, the team produced a list of 3,600 compounds that were likely to be antibiotic yet unlikely to be toxic. They then used a Monte Carlo tree search method to generate explanations, or rationales, for the molecules' antibiotic effects based on their chemical substructures; such a rationale can point to a whole class of compounds that could be antibiotic candidates. According to lab director James Collins:

The insight here was that we could see what was being learned by the models to make their predictions that certain molecules would make for good antibiotics. Our work provides a framework that is time-efficient, resource-efficient, and mechanistically insightful, from a chemical-structure standpoint, in ways that we haven’t had to date.
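The screening step described above (average an ensemble's predictions, then keep compounds that look antibiotic but not toxic) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the team's code: the cutoffs `ANTIBIOTIC_MIN` and `TOXICITY_MAX`, the record layout, and the compound IDs are all hypothetical, and the scores are made up rather than produced by Chemprop.

```python
from statistics import mean

# Hypothetical cutoffs, chosen only for illustration; the article does not state them.
ANTIBIOTIC_MIN = 0.4
TOXICITY_MAX = 0.2
CELL_TYPES = ("liver", "muscle", "lung")


def ensemble_score(member_scores):
    """Average the per-model probabilities into one ensemble prediction."""
    return mean(member_scores)


def screen(compounds):
    """Keep compounds predicted antibiotic and non-toxic in every cell type."""
    hits = []
    for c in compounds:
        activity = ensemble_score(c["antibiotic_scores"])
        toxicity = [ensemble_score(c["toxicity_scores"][cell]) for cell in CELL_TYPES]
        if activity >= ANTIBIOTIC_MIN and all(t <= TOXICITY_MAX for t in toxicity):
            hits.append((c["id"], round(activity, 3)))
    # Rank the most promising candidates first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up scores for three hypothetical compounds.
compounds = [
    {"id": "cmpd-A", "antibiotic_scores": [0.9, 0.8, 0.7],
     "toxicity_scores": {"liver": [0.10, 0.05], "muscle": [0.1, 0.1], "lung": [0.0, 0.1]}},
    {"id": "cmpd-B", "antibiotic_scores": [0.9, 0.9, 0.9],
     "toxicity_scores": {"liver": [0.6, 0.7], "muscle": [0.1, 0.1], "lung": [0.0, 0.1]}},
    {"id": "cmpd-C", "antibiotic_scores": [0.1, 0.2, 0.0],
     "toxicity_scores": {"liver": [0.0, 0.0], "muscle": [0.0, 0.0], "lung": [0.0, 0.0]}},
]

shortlist = screen(compounds)
print(shortlist)
```

In this toy data only `cmpd-A` survives: `cmpd-B` scores well for antibiotic activity but is predicted toxic to liver cells, and `cmpd-C` is predicted inactive. Requiring low toxicity in every cell type, rather than on average, is a design choice made here for the sketch.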

Recently, AI models have been applied to several problems in chemistry and biology. In 2020, the Collins Lab used an AI model to identify a known compound, halicin, as a potential antibiotic. InfoQ has covered other applications, such as DeepMind's AlphaFold2, which solved a long-standing protein structure prediction challenge, and Meta's ESMFold, an AI model for predicting protein structure from an amino acid sequence.

The latest work from the Collins Lab uses Chemprop, an open-source deep-learning library for predicting molecular properties. The team also used another open-source library, RDKit, to compute additional input features, such as a molecule's number of hydrogen bond acceptors. To gather the training data for their antibiotic prediction model, the researchers grew cultures of S. aureus bacteria and added each of the 39,312 compounds to determine which ones inhibited bacterial growth. This screen identified 512 compounds, which were labeled as antibiotic in the dataset.
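The labeling step above amounts to turning a growth measurement into a binary training label. A minimal sketch follows, assuming a hypothetical cutoff (`GROWTH_CUTOFF`) on growth relative to an untreated control; the article does not state the threshold the researchers used, and the measurements below are invented. The final lines use the article's own counts to show how imbalanced the resulting dataset is.

```python
# Hypothetical cutoff: call a compound "active" if the treated culture grew to
# less than 20% of an untreated control. The article does not give the real value.
GROWTH_CUTOFF = 0.2


def label_active(relative_growth, cutoff=GROWTH_CUTOFF):
    """Return 1 if the compound inhibited growth (labeled antibiotic), else 0."""
    return 1 if relative_growth < cutoff else 0


# Made-up relative growth values for four hypothetical compounds.
measurements = {"cmpd-A": 0.05, "cmpd-B": 0.93, "cmpd-C": 0.19, "cmpd-D": 1.02}
labels = {name: label_active(growth) for name, growth in measurements.items()}
print(labels)

# Class balance reported in the article: 512 actives among 39,312 compounds.
article_rate = 512 / 39312
print(f"{article_rate:.1%} of the training compounds were labeled antibiotic")
```

With roughly one positive example per 77 compounds, a model that always predicts "inactive" would already be about 98.7% accurate, which is one reason accuracy alone is a poor metric for this kind of screen.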

Andrew Ng's AI newsletter The Batch highlighted the research, writing:

Antibiotic-resistant infections are among the top global public health threats directly responsible for 1.27 million deaths in 2019, according to the World Health Organization. New options, as well as efforts to fight the emergence of resistant strains, are needed.

In a discussion about the work on Reddit, one user wrote:

That's the one great thing that AI is good for, combing through lots of data and finding patterns. It will be exciting when this is applied more to more medicine, health, and space research.

The MIT team open-sourced their model code and checkpoints as well as the training data; all are available on GitHub.
