Google Presents MultiModel: A Neural Network Capable of Learning Multiple Tasks in Multiple Domains

Google has created a neural network that accepts inputs from multiple modalities and can generate output in multiple modalities.

Currently, many machine learning applications focus on a single domain: machine translation builds models for one language pair, and image recognition algorithms perform only one task (e.g. describing an image, assigning it to a category, or finding the objects it contains). The human brain, by contrast, performs well across all of these tasks and transfers knowledge from one domain to another: it can even apply what we learn by listening to domains we only see or read.

Google built a model that performs eight tasks across multiple domains: speech recognition, image classification and captioning, sentence parsing, and translation back and forth between English-German and English-French. It consists of an encoder, a decoder, and an "input-output mixer" that feeds previous inputs and outputs to the decoder. In the diagram accompanying the announcement, each "petal" represents a modality (sound, text, or images); the network can learn any task that maps one of these input modalities to an output modality.
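To make the data flow concrete, the following is a minimal NumPy sketch of the idea, not the actual Tensor2Tensor implementation: small modality-specific networks project each input type into a shared representation space, a shared encoder/decoder body operates in that space, and the mixer conditions the decoder on both the encoded input and the output generated so far. All sizes and the single-layer "bodies" are illustrative assumptions.

```python
# Illustrative sketch of MultiModel's data flow (assumed shapes, toy bodies).
import numpy as np

D = 64  # shared representation width (illustrative)
rng = np.random.default_rng(0)

# Modality nets: project raw features of each modality into the shared space.
modality_nets = {
    "text":  rng.standard_normal((300, D)) * 0.01,   # e.g. word embeddings
    "image": rng.standard_normal((2048, D)) * 0.01,  # e.g. patch features
    "audio": rng.standard_normal((128, D)) * 0.01,   # e.g. spectrogram frames
}

def encode(inputs, modality):
    h = inputs @ modality_nets[modality]   # into the shared space
    return np.tanh(h)                      # stand-in for the real encoder body

def mix(encoded_inputs, previous_outputs):
    # Stand-in for the input-output mixer: combine the encoded source with
    # the outputs generated so far before handing them to the decoder.
    return np.concatenate([encoded_inputs, previous_outputs], axis=0)

def decode(mixed, out_modality):
    h = np.tanh(mixed.mean(axis=0, keepdims=True))  # stand-in decoder body
    return h @ modality_nets[out_modality].T        # back to output modality

# One decoding step: an "image" input producing a "text" output.
image_features = rng.standard_normal((10, 2048))
prev_tokens = np.zeros((1, D))  # nothing generated yet
logits = decode(mix(encode(image_features, "image"), prev_tokens), "text")
print(logits.shape)  # (1, 300): scores over the text vocabulary
```

Because only the thin modality nets are specific to a data type, the same shared body can be trained on all eight tasks at once, which is what allows knowledge to flow between domains.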

In November 2016, Google published work on zero-shot translation. That algorithm maps all sentences to an "interlingua": a representation of the sentence that is the same for every input and output language. Trained only on English-Korean and English-Japanese language pairs, the neural network was able to translate between Japanese and Korean without ever seeing such a sentence pair.
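The mechanism that makes this possible, as described in Google's multilingual translation work, is strikingly simple: a single shared model is steered by an artificial token prepended to the source sentence that names the desired target language. The sketch below illustrates the idea; the exact token spelling follows the paper's "<2xx>" examples but is otherwise an assumption about formatting.

```python
# Illustrative sketch of the zero-shot trick: one shared model, steered by
# an artificial target-language token prepended to the source sentence.
def prepare_input(source_sentence: str, target_lang: str) -> str:
    return f"<2{target_lang}> {source_sentence}"

# Training pairs cover only en<->ja and en<->ko, yet at inference time the
# same model can be asked for a ja->ko translation directly:
print(prepare_input("こんにちは", "ko"))  # a language pair never seen in training
```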

Google reports that tasks with small amounts of training data perform better when trained with MultiModel. Machine learning models usually improve with more training data, and MultiModel effectively draws extra data from multiple domains. Note, however, that this approach did not break any existing records on the standard benchmarks.

MultiModel is open-sourced as part of the Tensor2Tensor library on GitHub. A paper detailing the methods and results is available on arXiv.org.
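For readers who want to experiment, Tensor2Tensor exposes its models through a registry, which the snippet below browses. The registry module and its lookup helpers exist in the library, but the exact name under which MultiModel is registered ("multi_model" below, derived from the class name) is an assumption and may vary between library versions.

```python
# Minimal sketch of looking up MultiModel in the Tensor2Tensor registry.
# Requires: pip install tensor2tensor (which pulls in TensorFlow).
from tensor2tensor.utils import registry
from tensor2tensor import models  # importing registers the bundled models

print(registry.list_models())              # inspect the registered model names
model_cls = registry.model("multi_model")  # assumed registration name
```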
