Uber Open-Sources Plug-and-Play Language Model for Controlling AI-Generated Text

Uber AI open-sourced their plug-and-play language model (PPLM) which can control the topic and sentiment of AI-generated text. The model's output is evaluated by human judges as achieving 36% better topic accuracy compared to the baseline GPT-2 model.

The team provided a full description of the system and experiments in a paper published on arXiv. PPLM starts with a pre-trained language model (LM), such as GPT-2. These LMs can produce complex output which approaches human fluency, but it is difficult to control the specific properties of the generated text. Instead of "fine-tuning" the LM with additional training data, PPLM uses a separate attribute model that can evaluate the LM's output for sentiment or topic; this model is used to control the text produced by the LM. A strength parameter can tune how much the attribute model adjusts the LM output. According to Uber's researchers,

PPLM allows a user to flexibly plug in one or more simple attribute models representing the desired control objective into a large, unconditional LM. The method has the key property that it uses the LM as is—no training or fine-tuning is required—which enables researchers to leverage best-in-class LMs even if they do not have the extensive hardware required to train them.

Recent state-of-the-art NLP research has focused on creating pre-trained models based on the transformer architecture. These models are large, containing hundreds of millions of parameters, and are trained on large datasets containing millions of words; the training may take several days of runtime on expensive GPU hardware. Researchers without the resources to train their own state-of-the-art models must often choose between a publicly available model that isn't quite suited to their task and a smaller, less accurate model of their own. Another alternative is to fine-tune a pre-trained model, but that risks catastrophic forgetting, in which the additional training degrades what the model learned originally.

The key to PPLM is an additional, simpler model, the attribute model (AM), that scores the output of the LM; in particular, it calculates the probability that the LM's output text has some attribute (for example, that the text has positive sentiment, or is about politics). The AM can also calculate the gradient of that probability, which is used to "steer" the LM. Transformer-based LMs are "autoregressive," meaning that as they generate a sequence of words, each previously generated word becomes an input to the system for creating the next word. In PPLM, the gradient of the AM is also used when generating the next word, so that the word is more likely to carry the desired attribute.
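The steering idea can be illustrated with a toy sketch. This is not Uber's implementation: the four-word vocabulary, the scores, the "politics" word set, and every function name are hypothetical, and the gradient is applied directly to the next-word scores for simplicity, whereas the real method works inside the LM. The attribute model here is just the probability that the next word falls in a topic word set, and the strength value plays the role of the strength parameter described earlier.

```python
import math

def softmax(logits):
    """Turn raw next-word scores into probabilities."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def attribute_prob(logits, bag):
    """Toy attribute model: probability that the next word is in the topic bag."""
    probs = softmax(logits)
    return sum(probs[i] for i in bag)

def attribute_gradient(logits, bag):
    """Gradient of log attribute_prob with respect to each next-word score."""
    probs = softmax(logits)
    p_bag = sum(probs[i] for i in bag)
    return [
        (probs[i] / p_bag if i in bag else 0.0) - probs[i]
        for i in range(len(logits))
    ]

def steer(logits, bag, strength=1.0, steps=3):
    """Nudge the scores along the attribute gradient; the LM itself is untouched."""
    steered = list(logits)
    for _ in range(steps):
        grad = attribute_gradient(steered, bag)
        steered = [z + strength * g for z, g in zip(steered, grad)]
    return steered

# Hypothetical vocabulary and scores standing in for a real LM's output.
vocab = ["the", "senate", "pizza", "election"]
lm_logits = [2.0, 0.5, 1.0, 0.3]
politics_bag = {1, 3}  # indices of "senate" and "election"

before = attribute_prob(lm_logits, politics_bag)
after = attribute_prob(steer(lm_logits, politics_bag), politics_bag)
print(f"P(politics word) before steering: {before:.3f}")
print(f"P(politics word) after steering:  {after:.3f}")
```

Setting strength to zero leaves the LM's scores unchanged, while larger values push more probability onto the topic words, at the cost of drifting further from what the unmodified LM would have written.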

Uber contrasted the "pluggable" nature of PPLM with other techniques that require training and fine-tuning the full model. For example, a team from Google Brain presented a paper at last year's NeurIPS conference that uses a generative-adversarial technique made popular by deep-learning "style-transfer" image processing systems. OpenAI created a system that uses reinforcement learning (RL) to incorporate human feedback in fine-tuning a GPT-2 LM. On Hacker News, user Gwern Branwen wrote:

What's particularly nice [about PPLM] is if you can plug in a classifier for things like esthetics based on human ratings, along the lines of [OpenAI's system] but better - why spend the enormous effort running [RL] to brute force the classifier to obtain desired text or image output, when you can just backprop through it and let the classifier itself tell you how exactly to improve the inputs?

PPLM source code is available on GitHub. A demo is also available on NLP research site HuggingFace and via a Google Colab notebook.
