
Google Researcher Invented New Technology to Bring Neural Networks to Mobile Devices

by Roland Meertens, Aug 26, 2017. Estimated reading time: 1 minute

Recently, many companies have released applications that use deep neural networks. Neural networks can require a large amount of computation, so they often appear in software-as-a-service applications running on GPU-powered servers. But for applications that must work without internet access, must be fast and responsive, or handle privacy-sensitive data, running the network on a server is not an option.

Lately, multiple companies have announced that they are working on putting neural networks on mobile devices. Apple announced its CoreML platform at WWDC 2017. Google is working on a mobile version of its popular TensorFlow toolkit called "TensorFlow Lite", and has also released several pre-trained image recognition models that let developers choose their own tradeoff between efficiency and accuracy.

Although developers can run their networks on mobile devices, their options for building faster applications with neural networks are still limited. One option is reducing the size of the network, which often comes with a loss of accuracy. Another option is training a full network and reducing its floating-point precision after training; with this approach, it is difficult to estimate in advance what effect the reduction will have on performance. There are also older techniques such as "Optimal Brain Damage", invented by Yann LeCun, director of AI research at Facebook. None of these methods for optimising network inference has become very popular.
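To make the second option concrete, here is a minimal, purely illustrative sketch of reducing floating-point precision after training: linear 8-bit quantization of a weight vector with a single scale factor. The function names and values are made up for illustration; real toolkits apply this per layer or per channel.

```python
def quantize(weights, num_bits=8):
    """Map float weights onto signed num_bits-bit integer codes with one scale."""
    max_abs = max(abs(w) for w in weights)
    levels = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = max_abs / levels if max_abs else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(codes, scale):
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in codes]

weights = [0.82, -0.51, 0.003, -0.99]
codes, scale = quantize(weights)
approx = dequantize(codes, scale)

# The rounding error introduced here is exactly the accuracy loss that
# is hard to predict in advance for a full network.
max_err = max(abs(w - a) for w, a in zip(weights, approx))
```

The integer codes need a quarter of the memory of 32-bit floats, but every weight is perturbed by up to half a quantization step, and the cumulative effect on a deep network's accuracy is what makes this option risky.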

Google researcher Sujith Ravi came up with a novel idea: co-train two neural networks. One is a full neural network, called the trainer network. The other, the projection network, tries to capture the input and intermediate representations of the trainer network in a low-memory representation, using efficient projection functions. Both networks are trained at the same time and share the same loss function, so the projection network learns from the trainer network. Once training is done, the large network can remain on the server while users download the small, efficient network to their smartphones.
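The "efficient functions" in the projection network are randomized, locality-sensitive projections. A hedged sketch of the idea, with sizes and names chosen purely for illustration: each projection function is a fixed random hyperplane, and an input vector is reduced to one bit per hyperplane (the sign of the dot product), giving a very compact representation.

```python
import random

random.seed(0)
INPUT_DIM = 16   # dimensionality of the representation being projected
NUM_BITS = 8     # size of the compact bit representation

# Fixed random hyperplanes: generated once and never trained, which is
# what keeps the projection step memory-efficient on device.
hyperplanes = [[random.gauss(0, 1) for _ in range(INPUT_DIM)]
               for _ in range(NUM_BITS)]

def project(x):
    """Map a float vector to a NUM_BITS-bit vector via random hyperplanes."""
    return [1 if sum(h_i * x_i for h_i, x_i in zip(h, x)) >= 0 else 0
            for h in hyperplanes]

x = [random.random() for _ in range(INPUT_DIM)]
bits = project(x)
```

During co-training, a small trainable layer on top of such bit vectors is fit against the shared loss, so similar inputs (which tend to land on the same side of most hyperplanes) receive similar predictions.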

Sujith Ravi's paper is available on arXiv. It also discusses how many bits are needed to score well on several well-known datasets.
