
Google Presents MultiModel: A Neural Network Capable of Learning Multiple Tasks in Multiple Domains

By Roland Meertens on Jul 12, 2017. Estimated reading time: 1 minute

Google created an algorithm that can take inputs from multiple modalities and can generate output in multiple modalities.

Currently, many machine learning applications focus on a single domain: machine translation builds models for one language pair, and image recognition algorithms perform only one task (e.g. describing an image, saying what category an image belongs to, or finding objects in it). The human brain, by contrast, performs well on all of these tasks and transfers knowledge from one domain to another; it can even transfer what we learn by listening to other domains, such as things we see or read.

Google built a model that performs eight tasks across multiple domains: speech recognition, image classification, image captioning, sentence parsing, and translation between English and German and between English and French, in both directions. It consists of an encoder, a decoder, and an "input-output mixer" that feeds previous inputs and outputs to the decoder. In Google's illustration of the architecture, each "petal" indicates a modality (sound, text, or images), and the network can learn any task whose input and output use these modalities.
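
For intuition, the sketch below shows the shape of this design in plain NumPy: modality-specific "modality nets" project each input type into one shared representation space, a single shared body processes it, and modality nets on the output side project back out. The layer sizes, the bare linear maps, and the tanh body are illustrative assumptions, not Google's actual architecture.

```python
# Minimal sketch of the MultiModel idea (not Google's code): per-modality
# projections into one shared space, one shared body for every task.
import numpy as np

D = 512  # width of the shared representation space (assumed for illustration)
rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Random linear projection standing in for a learned layer."""
    w = rng.standard_normal((in_dim, out_dim)) * 0.01
    return lambda x: x @ w

# Modality nets: modality-specific projections into and out of the shared space.
input_nets = {
    "text":  linear(256, D),    # e.g. embedded tokens
    "image": linear(1024, D),   # e.g. flattened convolutional features
    "audio": linear(128, D),    # e.g. spectrogram frames
}
output_nets = {
    "text":  linear(D, 256),
    "image": linear(D, 1024),
}

def shared_body(h):
    # Stand-in for the single shared encoder, input-output mixer, and decoder
    # that MultiModel reuses across all eight tasks.
    return np.tanh(h)

def multimodel(x, in_modality, out_modality):
    h = input_nets[in_modality](x)       # into the shared space
    h = shared_body(h)                   # one body, regardless of the task
    return output_nets[out_modality](h)  # back into the output modality

# The same shared parameters serve image captioning (image -> text) ...
captions = multimodel(rng.standard_normal((7, 1024)), "image", "text")
# ... and speech recognition (audio -> text) alike.
transcripts = multimodel(rng.standard_normal((7, 128)), "audio", "text")
```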

In November 2016, Google published work on zero-shot translation. That algorithm maps every sentence to an "interlingua": a representation of the sentence that is the same for every input and output language. By training only on English-Korean and English-Japanese language pairs, the neural network was able to translate Japanese to Korean without ever seeing such a sentence pair.
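
The trick behind this is simple to sketch: in the zero-shot work, an artificial token naming the target language is prepended to every source sentence, and one model is trained on all pairs at once. The toy sentences below are made up for illustration; only the token-prefix idea comes from the paper.

```python
# Illustrative sketch of the target-language-token trick from Google's
# zero-shot translation work.
def make_example(src_sentence: str, tgt_lang: str) -> str:
    # e.g. "<2ko> I like cats" asks the model to produce Korean.
    return f"<2{tgt_lang}> {src_sentence}"

# Training only ever pairs English with Korean or Japanese:
train = [
    (make_example("I like cats", "ko"), "나는 고양이를 좋아한다"),
    (make_example("나는 고양이를 좋아한다", "en"), "I like cats"),
    (make_example("I like cats", "ja"), "私は猫が好きです"),
    (make_example("私は猫が好きです", "en"), "I like cats"),
]

# At inference time, the same model can be asked for a pair it never saw,
# because all languages share the learned "interlingua" representation:
zero_shot_query = make_example("私は猫が好きです", "ko")  # Japanese -> Korean
```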

Google reports that tasks with small amounts of training data perform better with MultiModel. Machine learning models usually perform better when given more training data, and MultiModel effectively lets a single task draw on training data from several domains at once. Note that this approach did not break any existing records on standard tasks.

MultiModel is open-sourced as part of the Tensor2Tensor library on GitHub. A paper detailing the methods and results can be found on arXiv.
