
Google Researcher Invented New Technology to Bring Neural Networks to Mobile Devices

by Roland Meertens, Aug 26, 2017. Estimated reading time: 1 minute

Recently, many companies have released applications that use deep neural networks. Because neural networks can require a large amount of computation, they often run as software-as-a-service applications on GPU-powered servers. For applications that must work without internet access, must be fast and responsive, or that handle privacy-sensitive data, running networks on servers is not an option.

Lately, multiple companies have announced that they are working on running neural networks on mobile devices. Apple announced its Core ML platform at WWDC 2017. Google is working on a mobile version of its popular TensorFlow toolkit called "TensorFlow Lite". Google has also released several pre-trained image-recognition models, letting developers choose their own tradeoff between efficiency and accuracy.

Although developers can now run their networks on mobile devices, their options for building faster applications with neural networks are still limited. One option is to reduce the size of the network, which often comes with a loss of accuracy. Another is to train a full network and reduce its floating-point precision after training; with this approach, it is difficult to estimate the effect on performance. There are also older techniques such as "Optimal Brain Damage", invented by Yann LeCun, director of AI research at Facebook. None of these methods for optimising network inference has become very popular.
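The second option above, reducing floating-point precision after training, can be sketched in a few lines of NumPy. This is a hypothetical illustration of linear quantization, not the procedure of any specific toolkit: the float weights are mapped onto a fixed number of integer levels and back, which makes the rounding error introduced by the lower precision directly measurable.

```python
import numpy as np

def quantize_weights(w, num_bits=8):
    """Linearly quantize float weights to num_bits levels and back.

    Hypothetical post-training quantization sketch: the float range
    [w.min(), w.max()] is mapped onto 2**num_bits - 1 integer levels,
    then mapped back to floats, so the rounding error is observable.
    """
    levels = 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / levels
    q = np.round((w - w_min) / scale).astype(np.int32)  # integer codes
    return q * scale + w_min                            # dequantized floats

weights = np.random.randn(256, 256).astype(np.float32)
restored = quantize_weights(weights, num_bits=8)
error = float(np.abs(weights - restored).max())
```

The maximum error is bounded by half a quantization step; in a real network, the hard part is estimating how that per-weight error propagates to the model's accuracy, which is exactly the difficulty the article mentions.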

Google researcher Sujith Ravi came up with a novel idea: co-train two neural networks. One is a full neural network, called the trainer network. The other, called the projection network, tries to represent the input and intermediate representations of the trainer network in a compact, low-memory form, using efficient projection functions. Both networks are trained at the same time and share the same loss function, so the projection network learns from the trainer network. Once training is complete, the large network can remain on the server while users download the small, efficient network to their smartphone.
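The co-training idea can be sketched on a toy problem. This is a deliberately simplified, hypothetical sketch: two logistic models stand in for the trainer and projection networks, fixed random sign bits stand in for the paper's efficient projection functions, and an assumed squared-error term couples the two models' outputs; the paper's actual architecture and loss weighting differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data: label depends on the first two features.
n, d = 200, 16
X = rng.normal(size=(n, d))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# "Projection" features: sign bits of fixed random projections, plus a bias.
bits = 8
P = rng.normal(size=(d, bits))
X_proj = np.hstack([(X @ P > 0).astype(float), np.ones((n, 1))])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w_t = np.zeros(d)         # "trainer network": logistic model on full features
w_p = np.zeros(bits + 1)  # "projection network": logistic model on bit features
lr, lam = 0.5, 1.0        # lam weighs the coupling term (assumed value)

for _ in range(300):
    p_t = sigmoid(X @ w_t)
    p_p = sigmoid(X_proj @ w_p)
    # Joint objective (simplified):
    #   L = CE(y, p_t) + CE(y, p_p) + lam * (p_t - p_p)**2
    # Both gradients include the coupling term, so the small model
    # is pulled toward the large model's predictions while training.
    g_t = X.T @ ((p_t - y) + lam * 2 * (p_t - p_p) * p_t * (1 - p_t)) / n
    g_p = X_proj.T @ ((p_p - y) + lam * 2 * (p_p - p_t) * p_p * (1 - p_p)) / n
    w_t -= lr * g_t
    w_p -= lr * g_p

acc_t = float(((sigmoid(X @ w_t) > 0.5) == (y == 1)).mean())
acc_p = float(((sigmoid(X_proj @ w_p) > 0.5) == (y == 1)).mean())
```

After training, only `w_p` and the fixed projection matrix `P` would need to ship to the device: the bit features are cheap to compute and the weight vector is tiny compared with the full model.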

Sujith Ravi's paper is available on arXiv. It also discusses how many bits are needed to score well on several well-known data sets.
