
Allen Institute for AI Open-Sources AI Model Inspection Tool LM-Debugger


The Allen Institute for AI (AI2) open-sourced LM-Debugger, an interactive tool for interpreting and controlling the output of language model (LM) predictions. LM-Debugger supports any HuggingFace GPT-2 model and allows users to intervene in the text generation process by dynamically modifying updates in the hidden layers of the model's neural network.

The release was announced by AI2 researcher Mor Geva Pipek on the AI2 blog. Based on previous work by Geva and colleagues, LM-Debugger surfaces internal token representations and the updates applied to them by the hidden feed-forward layers of a Transformer network. In addition to supporting HuggingFace GPT-2 models, LM-Debugger can operate on other models "with only a few local modifications." Using the system's interactive UI, users can trace the updates of token representations through each layer and can influence the model's output by suppressing small sub-updates. In a set of experiments, the AI2 team showed that changing only 10 sub-updates in a GPT-2 model can reduce the toxicity of its output by 50%; they also showed that the sub-updates can provide a signal for "early exit" in output generation, yielding an average computation savings of 20%. According to the team:

Our findings shed light on the prediction construction process in modern LMs, suggesting promising research directions for interpretability, control, and efficiency.

The Transformer architecture has become the de-facto standard for deep-learning natural language models. However, like most deep-learning models, it can be difficult to understand why the model produced a given output. This, coupled with concerns about toxic or misleading output, has led to an increased interest in understanding the inner workings of such models.

Geva and team recently published a paper investigating how certain Transformer components, the hidden feed-forward layers, contribute to the model's final output. They showed that these layers can be viewed as updates to the representation of the input tokens, and these updates can be viewed as a distribution over the output vocabulary. More specifically, each feed-forward layer can be decomposed into a set of value vectors that encode concepts and perform sub-updates that "promote" or strengthen the output probability of certain tokens. The output of the model can be steered toward a final output token by suppressing the promotion of undesired tokens.
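The decomposition described above can be illustrated with a small numerical sketch. The weights below are random toys, not real GPT-2 parameters, but they show the key identity from the paper's framing: a feed-forward layer's output equals a sum of value vectors (the rows of its second projection matrix) weighted by input-dependent coefficients, and each value vector can be read as logits over the vocabulary through the output embedding matrix.

```python
import numpy as np

# Toy dimensions; a real GPT-2 layer uses d_model=768, d_ff=3072, vocab=50257.
rng = np.random.default_rng(0)
d_model, d_ff, vocab = 8, 16, 10

W_in = rng.normal(size=(d_ff, d_model))  # first FF projection ("keys")
V = rng.normal(size=(d_ff, d_model))     # rows of second projection = value vectors
E = rng.normal(size=(vocab, d_model))    # output embedding matrix

x = rng.normal(size=d_model)             # a token's hidden representation
m = np.maximum(W_in @ x, 0.0)            # coefficients (ReLU activations)

ff_output = V.T @ m                                  # standard FF computation
decomposed = sum(m[i] * V[i] for i in range(d_ff))   # same thing as sub-updates
assert np.allclose(ff_output, decomposed)

# The tokens "promoted" by the strongest sub-update: read its value
# vector as vocabulary logits and take the top entries.
i_top = int(np.argmax(m))
promoted = np.argsort(E @ V[i_top])[::-1][:3]
print("sub-update", i_top, "promotes tokens", promoted)
```

Each `m[i] * V[i]` term is one "sub-update"; suppressing it simply removes that term from the sum, which is the intervention LM-Debugger exposes in its UI.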

[Image: Transformer Updates]

Using this insight, the AI2 team constructed a web-based UI that allows users to inspect and modify the sub-updates happening during output generation. The view shows the top 10 tokens in a layer's output distribution, with the option to "suppress" any of them. LM-Debugger also contains a search feature for exploring value vectors: for each value vector, it shows the top tokens that vector promotes. This allows users to analyze the concepts encoded by a value vector and to identify clusters of related value vectors.
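The "suppress" intervention can be sketched in the same toy setting: zeroing a sub-update's coefficient removes that value vector's contribution to the residual stream, lowering the logits of exactly the tokens it promotes. All weights here are random stand-ins, not real model parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, d_ff, vocab = 8, 16, 10
W_in = rng.normal(size=(d_ff, d_model))  # first FF projection
V = rng.normal(size=(d_ff, d_model))     # rows = value vectors
E = rng.normal(size=(vocab, d_model))    # output embedding matrix

x = rng.normal(size=d_model)             # hidden representation of a token
m = np.maximum(W_in @ x, 0.0)            # sub-update coefficients

def logits(coeffs):
    # Residual stream after the FF layer, read out through the embedding E.
    return E @ (x + V.T @ coeffs)

base = logits(m)
i = int(np.argmax(m))            # the strongest sub-update
tok = int(np.argmax(E @ V[i]))   # a token this value vector promotes

m_suppressed = m.copy()
m_suppressed[i] = 0.0            # the "suppress" button in the UI

after = logits(m_suppressed)

# The logit of the promoted token drops by exactly the sub-update's
# contribution, m[i] * (E @ V[i])[tok].
assert np.isclose(base[tok] - after[tok], m[i] * (E @ V[i])[tok])
```

Repeating this for a handful of toxicity-promoting value vectors is, in spirit, how the team's 10-sub-update intervention steers generation away from undesired tokens.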

Explainable AI systems are an active research topic. In 2019, InfoQ covered AI2's AllenNLP Interpret toolkit, which uses gradient-based methods for explaining the results from natural-language processing (NLP) models. InfoQ also covered an interactive visualization tool called exBERT, developed by MIT-IBM AI Labs and Harvard NLP Group. The tool lets users explore the representations learned by encoder-only Transformer models such as BERT.

In a Twitter discussion about LM-Debugger, Geva replied to one user who asked if the system could be applied to BERT:

Yes, it is possible and could be very interesting to use LM-Debugger to analyze models like BERT too. I'm happy to guide you on how to do that in case you wish to implement that. We might add support for this in the future.

The LM-Debugger code is available on GitHub. Demo versions of the system for medium and large GPT-2 models are available on the AI2 website.
